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Abstract 

In this paper, we present a new approach for the analysis of iterative node-based verification-based (NB-VB) 
recovery algorithms in the context of compressive sensing. These algorithms are particularly interesting due to 
their low complexity (linear in the signal dimension n). The asymptotic analysis predicts the fraction of unverified 
signal elements at each iteration I in the asymptotic regime where n — > oo. The analysis is similar in nature to the 
well-known density evolution technique commonly used to analyze iterative decoding algorithms. To perform the 
analysis, a message-passing interpretation of NB-VB algorithms is provided. This interpretation lacks the extrinsic 
nature of standard message-passing algorithms to which density evolution is usually applied. This requires a number 
of non-trivial modifications in the analysis. The analysis tracks the average performance of the recovery algorithms 
over the ensembles of input signals and sensing matrices as a function of t. Concentration results are devised to 
demonstrate that the performance of the recovery algorithms applied to any choice of the input signal over any 
realization of the sensing matrix follows the deterministic results of the analysis closely. Simulation results are also 
provided which demonstrate that the proposed asymptotic analysis matches the performance of recovery algorithms 
for large but finite values of n. Compared to the existing technique for the analysis of NB-VB algorithms, which 
is based on numerically solving a large system of coupled differential equations, the proposed method is much 
simpler and more accurate. 

Index Terms- Density evolution, compressive sensing, iterative recovery algorithms, sparse sensing matrix, sparse 
graphs, message-passing algorithms. 

I. Introduction 

Compressed sensing was introduced with the idea to represent a signal ))6l" with k nonzero elements 
with measurements c 6 IR m , where k < m <^ n, and yet to be able to recover back the original 
signal v [[TJ, [|2j. In the measuring process, also referred to as encoding, signal elements are mapped to 
measurements through a linear transformation represented by the matrix multiplication c = Gv, where 
the matrix G G R mxn is referred to as the sensing matrix. This linear mapping can also be characterized 
by a bipartite graph [3], referred to as the sensing graph. 
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In the recovery process, also referred to as decoding, based on the knowledge of the measurements and 
the sensing matrix, we estimate the original signal. Decoding process is successful if v is estimated 
correctly. Three performance measures namely, density ratio 7 = k/n, compression ratio r c = m/n, and 
oversampling ratio r a = m/k are used in order to measure and compare the performance of the recovery 
algorithms in the context of compressed sensing [j] 

The sensing matrix in compressive sensing can be either dense or sparse. A sensing matrix is considered 
dense if it has few, if not none, zero entries. Sparse matrices, on the other hand, have few nonzero entries 
in each row and column. One major difference between these two types of matrices is the encoding 
complexity associated with each class. For sparse matrices, the number of operations needed to calculate 
the measurements is considerably lower than the one needed for dense matrices. 

Decoding algorithms can be classified based on the class of sensing matrix they use. The decoding 
algorithms in each class have certain properties in common, which follows. (For a comprehensive study 
on the topic we refer the interested readers to [|7J.) Decoding algorithms associated with dense matrices 
have, generally, high complexity (between 0(n 2 ) and 0(n 3 )) compared to the lower complexity of 
algorithms utilizing sparse matrices (between 0(n) and 0(n 2 )). To have a better feeling about the 



complexity and running time of these two classes of algorithms, we have included (Section |VH[ Fig. 
[3]) the comparison between two standard recovery algorithms for dense matrices (£1 minimization and 
weighted t\ minimization) and one algorithm (SBB) for sparse matrices. As can be seen, the decoding 
algorithm on sparse matrices is faster by about two orders of magnitude. Decoding algorithms for dense 
matrices are mostly based on linear or convex programming [[TJ, Q, [[8j, ||9j. The reason is that random 



dense matrices satisfy restricted isometry property (RIP) with overwhelming probability [10|, |11|. The 



RIP was introduced by Candes and Tao in [12| as the main restriction on the sensing matrix so that the 
recovery based on linear programming will be able to successfully recover the signal. Sparse matrices, 
on the other hand, do not satisfy RIP unless m = Q(k 2 ) |l3|j^]ln fact, many of the decoders based on 
sparse sensing matrices are iterative [|3j, p4|-[|26|. Although more computationally complex, decoding 
algorithms on dense matrices tend to recover signals with larger number of nonzero elements (higher 
density ratio) compared to decoders on sparse matrices. Nevertheless, the high complexity of decoding 
algorithms on dense matrices hinders their application to high-dimensional signal recovery (signals with 
large n). 

Focusing on recovery algorithms based on sparse graphs, we can further divide them into two major groups. 
In one group, we have algorithms that use group testing and similar techniques from estimation theory 

'For successful decoding clearly we need r a > 1. It is desirable to have this parameter as small as possible. Indeed, in (4) authors proved 
that r — 1 is achievable in the asymptotic case (n — > 00). This means that the highest density ratio that an algorithm can possibly handle 
is 7* = r c . Authors in J5J have shown that if the sensing matrix consists of i.i.d. Gaussian elements, then a decoder based on the £0 norm 
can recover the original signal with m = k + 1 measurements; i.e., r « 1. To find the solution based on the £0 recovery, however, one has 
to perform an exhaustive search, which is computationally too complex |6|. 

2 Authors in |7| extended the definition of RIP and showed that sparse matrices satisfy a generalized RIP constraint. The generalized RIP 
suffices for the linear programming decoders to succeed in recovering sparse signals. However, the resulting bound on the reconstruction 
error is weaker compared to the case where the sensing matrix satisfies the original RIP condition. 
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[14|-[17|. These are referred to as combinatorial algorithms. In the other group, recovery algorithms 
work with the bipartite graph associated with the sensing matrix by passing messages over the edges of 
the graph [[3j, p8|-p6|. These are referred to as message-passing algorithms. Combinatorial algorithms, 



generally, assume that the decoder knows the size of the support set, k [14j-[17|. These algorithms, 
have two main steps. In the first step, the algorithm outputs an estimate which has more nonzero values 
than the original signal. In the next step, knowing the parameter k, the estimate is pruned so that the 
output estimate has the same support size as the original signal. Combinatorial algorithms are, in general, 
more computationally complex than message-passing algorithms. For example, the algorithm introduced 



in [14 1 has complexity 0(k polylog(n)), which translates to 0{n polylog(n)) in the regime where k 
scales linearly with n. Message-passing algorithms, on the other hand, have computational complexity 

0{n). 

In this work, we are interested in low-complexity recovery algorithms that exploit the sparsity of the sensing 
matrix. In particular, we are interested in message-passing recovery algorithms. In [ fT9| , the authors propose 
a simple message-passing algorithm to reconstruct non-negative signals. This algorithm assumes lower 
and upper bounds for the signal elements. It then shrinks the difference between the two bounds through 
iterations. The important feature of the recovery algorithm introduced in JT9| is its uniform guarantee on 
signal reconstruction. Another approach in message-passing algorithms is to assume a prior distribution for 
the signal elements and try to maximize the a-posteriori distribution of the elements based on the observed 



measurements. In [26|, the authors assume Gaussian mixture priors. The main problem associated with 



this approach is that the length of the messages passed over the edges of the graph grow exponentially 



fast with the number of iterations. In another work p5| , the authors assume Jeffreys' priors p7| and 



aim at recovering the support set of the original signal using message-passing algorithms. Then, they 



apply well-known least-square algorithms, such as LSQR [28 1, to estimate the value of signal elements. 
Moreover, in [25], it is assumed that the size of the support set, k, is known. Algorithms discussed so far 
are either restrictive, in the sense that they assume some knowledge of the support set at the decoder, or 
have a high computational complexity that makes them impractical in applications with large n. 

In this paper, we are interested in a sub-class of message-passing algorithms called Verification-Based 
(VB) algorithms. These algorithms were originally introduced in the context of channel coding with non- 



binary alphabet [29|. Thanks to the connection between compressive sensing and linear channel codes 



over real numbers noted in B2JJ, the authors in [18| and [ 20 1 used VB algorithms in the context of 



compressed sensing. This class of algorithms has certain properties that make it perhaps one of the 
most interesting classes of recovery algorithms in compressed sensing. The VB algorithms recover signal 
elements in iterations. When an element is recovered, its value is kept unchanged in future iterations. 
The algorithms in this class have decoding complexity 0(n), which makes them suitable for applications 
involving recovery of signals with large n. Moreover, these algorithms operate on sparse sensing graphs, 
which translates to less computations in the encoding process. Another main advantage of VB algorithms 
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is that they are not sensitive to the distribution of nonzero elements of the sensing matrix as well as the 
distribution of nonzero elements of the signal, if certain conditions are satisfied. We will elaborate on this 



topic further in section III These properties make the VB algorithms a suitable choice for low-complexity 
recovery of sparse signals. The VB algorithms are, however, sensitive to the presence of noise in the 
measured data. In this work, our main focus is on the study of the noiseless case. This case is important 
because i) the noise-free analysis of recovery algorithms can serve as an upper bound for the performance 



of the noisy versions, and ii) noiseless compressive sensing has its own applications such as those in [24|, 



[30|. For the sake of being thorough and to demonstrate the potential of VB algorithms in recovering 



signals from noisy measurements, we will also comment on using standard thresholding techniques to 



deal with noisy measurements in Section VI An in-depth analysis of the approach, however, is beyond 
the scope of this paper. 

Another interesting feature of VB algorithms is that their performance can be analyzed in the asymptotic 
case (n — > oo). Assume a probabilistic input model, in which a signal element is nonzero (and takes a 
value from a certain distribution) with probability a and is zero with probability 1 — a. In the sequel, 
we refer to parameter a as the density factor. Furthermore, let cr^ denote the probability that a signal 
element is nonzero and unverified before iteration £ over the ensemble of all sensing graphs and inputs 
of interest. So, = a. If lim^oo ar> = 0, then the algorithm is called successful for the initial density 
factor On the other hand, if there exists e > 0, such that lim^oo > e, then the algorithm is said 



to fail for the initial density factor a. Using the combinatorial arguments in pT| and p2| , one can see 
that the algorithms complete the recovery once the probability — > 0. 



Authors in [20|, [22|, [23|, [29 1 have shown that for each VB recovery algorithm in the asymptotic regime 
as n — > oo and £ — > oo, a limiting value exists for a, before which the recovery algorithm is successful 
and beyond which it is not. We refer to this limit as the success threshold. The success threshold serves as 
an asymptotic measure of performance for VB algorithms. It can also be used to estimate the performance 
of these algorithms for finite but large values of n. To this end, researchers have analyzed VB algorithms 
in the asymptotic regime (n — V oo, i — > oo) in order to find the success threshold associated with each 
VB algorithm. There are two categories of VB algorithms: node-based (NB) and message-based (MB) 
p3| . The two categories yield different success thresholds and are analyzed using different techniques. 
In general, NB algorithms have higher success thresholds and are harder to analyze. We elaborate on the 



differences between the two categories and their corresponding analytical tools in Section |II-B| The focus 
of this work is on the analysis of NB algorithms. 



Asymptotic analysis of VB algorithms can be found in [20|, [23 1, [29]. Algorithms considered in [20|, 



29[ are of MB type, while the authors in [23 1 considered the NB type recovery algorithms. Moreover, for 



their analysis, the authors of p0| made the assumption that k/n — > as n - > oo. From a practical point 



It is easy to prove that the probability of a zero- valued signal element being unverified at iteration £ is upper bounded by d c j-g— ^ . 
Hence, when a tends to zero, this probability also tends to zero. 
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of view, however, it is more desirable to analyze the cases in which the number of nonzero elements of 
the signal, k, scales linearly with its dimension n. In fact, in this paper, we show that VB algorithms are 
capable of recovering signals whose density ratio is a nonzero constant, i.e., k/n remains constant, as n 
is increased. 



The analysis of NB-VB algorithms discussed in [23 1 results in a system of coupled differential equations. 



Due to the high complexity of solving the resulting differential equations, the authors used numerical 
methods to approximate the asymptotic results. They further assumed that the algorithms resolve at 
most one signal element in each iteration. Therefore, the number of iterations needed for the numerical 
analysis equals n; the number of signal elements. The challenges associated with the analysis of p3| are 
twofold: 1) as the analysis is only valid for n — >• oo, one has to choose very large n for the numerical 
approximation, which directly translates to long running time and high computational complexity, and 2) 
since the numerical approach is used to approximately solve the differential equations, the approximation 
errors can potentially propagate through the iterations. This makes it hard to evaluate how close the 
success threshold reported by this analysis (even for large values of n) is to the real success threshold. In 
comparison, the analysis proposed in this paper is much faster (by about two orders of magnitude for the 
tested cases) and is more robust against numerical errors. The general approach for the analysis is also 
different and is based on basic probability theory. 

The goal of this work is to develop a low-complexity framework for the asymptotic analysis (as n — > 
oo,£ — > oo) of NB-VB algorithms over sparse random sensing graphs and extend it to include recovery 
algorithms of similar nature such as that of OJ. In our analysis, we assume that the measurements are 
noiseless. The main analytical tool used in this paper is probability theory. We demonstrate that the 
recovery algorithms can be described by a first order time-varying Markov chain. We thus track the 
distribution of the states of this Markov chain through iterations in the analysis. The purpose is to find 
the transition probabilities between different states of the Markov chain as the iterations progress. The 
computational complexity of the proposed analysis thus increases linearly with the number of iterations. 
The calculation of transition probabilities includes simple mathematical operations, more specifically 
addition and multiplication, as opposed to solving complex systems of coupled differential equations, 
as is the case in p3| . One should however note that for a sensing graph in which each signal element 
affects d v measurements and each measurement is a linear combination of d c signal elements, where 
d v and d c are fixed and positive integers, the number of states in the proposed analysis is 0(d v + d 2 c ), 



which is in the same order as the number of differential equations in [23| is. As part of our asymptotic 
analysis, we also prove concentration results which certify that the performance of a recovery algorithm 
for a random choice of the input signal and the sensing matrix is very close to what is predicted by the 
density evolution results at the limit of n — > oo. 



Using the proposed analysis, we can determine the distribution of the decoder states at any desired 
iteration. By tracking the distribution of the decoder states with iterations, we then find the success 
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threshold of different NB-VB algorithms. Moreover, using our approach, we perform a comprehensive 
study and comparison of performance of different VB recovery algorithms over a variety of sparse graphs. 
Our simulations show that the behavior of VB algorithms, when applied to signals with large lengths (in 
the order of 10 5 ), are in good agreement with the asymptotic analytical results. 

The rest of the paper is organized as follows. In section |TTJ. we introduce the class of bipartite graphs 
and inputs signals of interest in this paper. We also provide a more detailed description of VB algorithms 



in this section. In section III, the decoding process in each VB algorithm is discussed further. Also in 
this section, we discuss the important notion of false verification and its probability in VB algorithms. A 



message-passing interpretation of the recovery algorithms is presented in section IV In this section, we 
also make a more detailed distinction between NB and MB recovery algorithms. The analysis framework 
will be introduced in section |Vj We propose a simple modification of VB algorithms to deal with noisy 
measurements in section VI Simulation results will be presented in section VII| Appendices [I] [TTJ. and III 



are devoted to the derivation of the transition probabilities. Appendix IV consists of some bounds needed 
for the concentration theorem, presented and proved in section |V} 



II. Background 



A. Ensembles of Sensing Graphs and Inputs 

A bipartite graph (or bigraph) Q{V LiC, E) is defined as a graph whose set of vertices V U C is divided 
into two disjoint sets V and C, so that every edge in the set of edges E connects a vertex in V to one in 
C. Corresponding to each such graph, a biadjacency matrix A(Q) of size \C\ x \V\ is formed as follows: 
the entry ay is 1 if there exists an edge ey G E connecting the vertex q G C to the vertex Vj G V; and 
is 0, otherwise. 

Let d v and d c be two positive integers. Consider a bigraph Q(VUC, E) with \V\ = n and \C\ = m, so that 
each vertex in V (C) is incident to d v (d c ) vertices in C (V). Clearly, nd v = md c . We refer to this bigraph 
as an (n, d V) d c )-biregular graph. The biadjacenecy matrix A(Q) associated to an (n, d V) d c )-biregular 
graph has d c l's in each row and d v l's in each column. 

Moreover, a bipartite weighted graph (or weighted bigraph) Q'(V U C, W(E)) is a generalization of the 
bigraph Q(V U C,E) in the sense that a weight u>y := iw(ey) G K\{0} is associated with each edge 
eij G E. The biadjacency matrix A(Q') corresponding to the weighted bigraph Q' is acquired from the 
biadjacency matrix A(Q) of the underlying bigraph Q by replacing nonzero ay values in A(Q) with Wij. 
A regular bipartite weighted graph (or weighted biregular graph) is defined similarly. 

For given parameters d v ,d c and n (m = nd v /d c ), let Q n (d v ,d c ) denote the ensemble of all (n,d v ,d c )- 
biregular graphs. Let us assume an arbitrary, but fixed, labeling scheme for vertex sets V and C over the 
ensemble. Further, let W be a matrix of size mxnof weights w drawn i.i.d. according to a distribution 
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f(w), and W™ xn be the ensemble of all such matrices. Now for any biregular graph Q(V U C, E) E 
Q n (d v , d c ) and any weight matrix W E W™ xn , we form the corresponding (n, d v , d c )-weighted biregular 
graph Q'(V U C, W(E)) as follows. To every edge E E, 1 < i < m, 1 < j < n, connecting ith vertex 
from C and jth vertex from V, we assign the weight w(eij) = Wij; i.e., the weight in row i and column 
j of the weight matrix W. Thus, we construct the ensemble of all (n, d v , d c ) -weighted biregular graphs, 
denoted by Q^[d v ,d^), by freely combining elements in Q n (d v ,d c ) and WJ lxn . 

Thus far, we have described the ensemble of graphs that are of interest in this work. In what fallows, 
we describe the ensemble of inputs of interest. Let a E [0, 1] be a fixed real number and v be a vector 
of length n with elements v f drawn i.i.d. according to a probability distribution function defined as 
follows: the element is zero with probability 1 — a, or follows a distribution g with probability a (i.e., 
Pr[vi = v] = ag(v) + (1 — a)S(v), where 5 is the Dirac delta function). We denote the ensemble of all 
such vectors by V™(a)|^] 

In compressive sensing, each measurement is a linear combination of the signal elements. With a slight 
abuse of notation, we use q (vj) for both the label and the value of the ith measurement (the jth 
signal element). We denote by c and v, the column vectors of the measurements Cj's (1 < i < m), 
and the signal elements v/s (1 < j < n), respectively. The underlying system of linear combinations 
can then be represented by the matrix multiplication c = Gv. In this paper, the sensing matrix G is the 
biadjacency matrix of a weighted bigraph Q(VUC, W(E)) drawn uniformly at random from the ensemble 
Gj(d v ,d c ). Henceforth, we refer to the graph Q as the sensing graph. Moreover, the signal vector v is 
drawn uniformly at random from the ensemble V"(a). The sets of signal elements and measurements are 
respectively mapped to the vertex sets V and C (\V\ = n, \C\ = m). The coefficient of the j th signal 
element (Vj E V) in the linear combination associated with the i th measurement q E C, the entry in 
G, is the entry Wij of the biadjacency matrix A(Q) of Q. 

Following the terminology frequently used in the context of coding^] we refer to the sets V and C as 
the variable nodes and check nodes, respectively. We will interchangeably use the terms variable nodes 
and signal elements as well as check nodes and measurements. The main focus of this paper is on the 
weighted biregular graphs. The results however, can be generalized to irregular graphs. 

B. Previous Work on VB Algorithms 

Luby and Mitzenmacher p9| proposed and analyzed two iterative algorithms over bigraphs for packet- 
based error correction in the context of channel coding. In these algorithms, a variable node can be in one 

4 It is worth noting that the expected fraction of nonzero elements in such a vector is a. Using a Cheraoff bound, it can be shown that the 
actual fraction of nonzero elements in a randomly chosen vector from this ensemble is tightly concentrated around its expected value (a) 
with high probability. 

5 In the context of coding, a linear code can be represented by a bigraph, where the two sets of nodes represent the code symbols, and the 
linear constraints that the symbols have to satisfy |33| . 
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of the two states: "verified" or "unverified". Under certain circumstances, a variable node is verified and a 
value is assigned to it. This node then contributes to the verification of other variable nodes. The decoding 
process continues until either the entire set of unverified variable nodes is verified, or the process makes 
no further progress while there are still some unverified variables. Due to the verification nature of the 
procedure, the two algorithms in [ |29| are called verification-based (VB) algorithms. If the assigned value 
to a variable node at a certain iteration is different from its true value, a false verification has occurred. In 
section [Till we discuss sufficient conditions for VB algorithms so that the probability of false verification 



is zero. 

The verification process in VB algorithms can be seen as a message-passing procedure. In general, a 
variable node sends its current state (either verified or unverified) to its neighboring check nodes along 
with its value (if verified). A check node processes the received messages and subsequently sends some 
messages to its neighboring variable nodes. Each unverified variable node decides on its next state, either 
verified or unverified, based on the received messages from check nodes. The process of passing messages 
between variable nodes and check nodes continues until all variable nodes are verified, or no variable 
node changes its state. 

In message-passing algorithms, a node can take two approaches in order to produce a message based on 
the set of received messages. In the first approach, the outgoing message is a function of all received 
messages. In this case, all messages leaving a node at a certain iteration are the same. In the second 
approach, the message passed from node a to node b in the bigraph, is a function of all the received 
messages by node a except the received message from node b. Therefore, the outgoing messages of a 
node at a certain iteration may be different, depending on the received messages. In the context of VB 
algorithms, the first approach is known as node-based (NB), while the second approach is called message- 
based (MB) [20 1, [|23||J^1 So, for a NB-VB algorithm, the state of a variable node is reported identically 



by all its outgoing messages, while in an MB-VB algorithm, different states may be reported by different 
outgoing messages from a variable node. 



As noted in [23 1, the authors in [29| defined the two VB algorithms using the NB representation but 



analyzed them using the MB representation. In [23 1, the authors proved that for one of the VB algorithms, 
the NB and MB versions perform the same, but for the other VB algorithm, the NB version outperforms the 
MB one. In compressed sensing, this implies that NB versions, in general, have higher success thresholds; 



i.e., can successfully recover signals with larger density ratios [21 1. 



A well-known method to analyze iterative message-passing algorithms in coding theory is density evolution 



[35 1 . In density evolution, the distribution of messages is tracked with the iteration number. The evolution 
of the message distributions with iterations will then reveal important properties of the decoding algorithm 
such as decoding threshold and convergence speed [j35j. The derivation of the message distribution 



6 In the context of iterative decoding algorithms, NB and MB approaches are known as non-extrinsic and extrinsic message-passing, 
respectively |34| . 
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however, requires the independence among the incoming messages to a node. The analysis is thus only 
applicable to extrinsic message-passing algorithms (MB decoders). To extend density evolution to NB 
algorithms, Zhang and Pfister p3| used a system of differential equations as the analytical tool. Applying 
their analysis to (d v , d c ) graphs, the number of differential equations is 0(d^ + d 2 c ). This rapidly becomes 
too complex to handle for large values of d v and d c ^\ Numerical methods were used in [ |2~3~| to solve the 
system of differential equations and consequently evaluate the performance of the NB algorithms. It is 
important to note that the authors in [23] analyzed a serial version of NB-VB algorithms, i.e., the version 
which allows only one variable node to be verified in each iteration. The total number of iterations in 
their calculations is thus equal to the size of the signal n. Another important issue regarding the analysis 



of [23 1 is that while the validity of the analysis is proved for the asymptotic scenario of n — > oo, the 



numerical results are highly dependent on the selected, large but still finite, value of n. 

III. VB Algorithms, Verification Rules and False Verification 
A. VB Algorithms and Verification Rules 

In compressive sensing, the decoder receives the vector of measurements and aims at estimating the original 
signal based on the knowledge of measurements and the sensing graph. In this section, we discuss four 
VB decoding algorithms. 

The first algorithm, here referred to as "Genie", is a benchmark VB algorithm in which the support set of 
the signal is known to the decoder^ We use the Genie algorithm and its analysis to motivate and explain 
the analytical framework. The success threshold associated with this algorithm serves as an upper bound 
for the performance of other VB algorithms. 

In other recovery algorithms, the decoder has no information about the support set. The next two decoders 
considered in this paper, are the two main VB decoding algorithms in the context of CS. The first algorithm 



is referred to as LM and is the first algorithm discussed in [20|; LM1. The second main VB algorithm is 



the algorithm introduced in [18|, which is the same as the second algorithm discussed in [20|; LM2. We 



refer to this algorithm as SBB. 



By the description given in Section II-B the algorithm in [[3j, here referred to as XH, also falls into the 



category of VB algorithms, and can be also analyzed using the proposed framework. The details of the 
analysis for this algorithm however, is not included in this paper. We just report some numerical results 



on the success threshold and the convergence speed of this algorithm in Section VII 



In what follows, we give the general description of the aforementioned VB algorithms, as found in the 
literature ( (3j, [20|, [22|, (23j). We then use this description to discuss the issue of false verification. In 



7 The number of differential equations for a (3,6) bigraph is about 30. 
8 The Genie algorithm is similar to the peeling algorithm over the BEC proposed in 31 
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Section [TV] we present the equivalent message-passing description of the VB algorithms. 



In the VB algorithms, when a variable node is verified at an iteration, its verified value is subtracted from 
the value of its neighboring check nodes. The variable node, then, is removed from the sensing bigraph 
along with all its adjacent edges. Hence, all the neighboring check nodes of the verified variable node 
face a reduction in their degree. In the next iteration, some variable nodes may be verified based on the 
degree and/or the value of their neighboring check nodes. The rules based on which the variable nodes 
are verified at each iteration are called verification rules and are as follows: 

• Zero Check Node (ZCN): If a check node has a zero value, all its neighboring variable nodes are 
verified with a zero value. 

• Degree One Check Node (D1CN): If a check node has degree 1 in a graph, its unique neighboring 
variable node is verified with the value of the check node. 

• Equal Check Nodes (ECN): Suppose we have N check nodes with the same nonzero value, then 1) 
all variable nodes neighboring a subset of these N check nodes (not all of them) are verified with 
the value zero; 2) if there exists a unique variable node neighboring all N check nodes, then it is 
verified with the common value of the check nodes. 



Verification rules ZCN and ECN are responsible for verifying variable nodes not in the support set. Since, 
the Genie algorithm has the complete knowledge of the support set, it has no need to apply these two 
rules. Hence, D1CN is the only rule used by the Genie. Other VB algorithms, each uses a combination of 
verification rules in order to verify and resolve unverified variable nodes. Assuming zero probability for 
false verification, the order in which the rules are applied does not affect the overall performance of the 
algorithm; it will only change the order in which variable nodes are verified. Verification rules adopted 
by different algorithms are summarized in Table [I] 

TABLE I 

Verification rules adopted in each VB algorithm 





ZCN 


D1CN 


ECN 


Genie 


Not Needed 


/ 


Not Needed 


LM 


/ 


/ 


X 


SBB 


/ 


/ 


/ 


XH 


/ 


X 


/ 



Based on Table [T| SBB applies the union of all rules to verify variable nodes. Therefore, this algorithm is 
expected to have the highest success threshold amongst the practical VB algorithms discussed here. This 
is verified in Section IVIII 

The ECN rule as stated above can not be easily captured in the analysis. Based on our extensive simulations, 
we conjecture that this recovery rule can be modified to read as follows (without affecting the asymptotic 
behavior of the recovery algorithms): 
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Modified Equal Check Nodes (MECN): Suppose we have N check nodes with the same nonzero value. 
Then if there exists a unique variable node neighbor to all such check nodes, it is verified with the common 
value of the check nodes. It is only in this case that any other variable node connected to such check 
nodes is verified as zero. 

B. False Verification 

Let /C denote the set of nonzero variable nodes in the signal; the support set. Also, let M (c) denote the 
set of variable nodes neighbor to a check node c. Now, consider the following facts: 

(1) Let C be an arbitrary subset of check nodes. If all the check nodes in C are neighbor to the same 
subset of nodes in /C, then all these check nodes have the same value. 

(2) Any check node with no neighbor in /C has a zero value. 

Verification rules ZCN and ECN in VB algorithms are designed based on the following assumptions: 

(1') Let C be any arbitrary subset of check nodes with the same value. Then all these check nodes are 

neighbor to the same subset of /C. 
(2') For any zero valued check node, none of its neighboring variable nodes belong to the set /C. 

It is worth noting that the assumptions (1') and (2') are the converses of the facts (1) and (2), respectively. 
To any choice of distributions / and g for nonzero weights of the sensing graph and nonzero signal 
elements, respectively, corresponds a certain probability that the converses fail to hold. Those distributions 
which make the converses hold true with probability 1 (almost surely), are of interest in this paper. In the 
following theorem, we give an example of distributions that make the statements (1') and (2') hold true 
almost surely. 

Theorem 1. Let C; L and Cj be two distinct check nodes and V« and Vj be their corresponding set of 
neighboring variable nodes in JC; i.e., V« = .M(cj) fl /C and Vj = M.{cj) fl /C. Suppose that at least one 
of the distributions f or g described before is continuous. Then the statements (V) and (2'), described 
above, are correct with probability one for Ci and Cj. 

To prove the theorem, we need the following fact. 

Fact 1. Let Xi and Xj be two independent samples drawn from a continuous distribution. It follows that: 

Pr (xi = Xj) = 0. 

Stated differently, no two independent samples of a continuous distribution will have the same value, 
almost surely. Moreover, for any constant c, we have 



Pr (xi = c) = 0. 
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Proof of Theorem Let G= [w^] be the sensing matrix. The value of a check node Cj is 
^2j-v j eM(c i ) w ij v j- Therefore, if at least one of the following conditions is satisfied, then the value of 
a check node follows a continuous distribution: 

1) the nonzero elements of the sensing matrix (weights associated to the edges of the sensing graph), 
follow a continuous distribution /. 

2) the value of a nonzero variable node (a variable node in the support set /C) follows a continuous 
distribution g. 

Therefore, according to Fact [TJ the subset of check nodes with the same value can not be independent 
almost surely. Their dependence implies that they are neighbor to the same set of nonzero variable nodes. 
This proves the statement (1'). Similarly, if a check node is neighbor to at least one nonzero variable 
node, its value follows a continuous distribution, which according to Fact [T] is zero with probability zero. 
This proves the statement (2'). ■ 

So, the continuity of / or g is a sufficient condition to have the probability of false verification equal 
to zero. In the rest of the paper, we assume that the statements (1') and (2') are correct with probability 
one and consequently, the probability of false verification for a variable node in any iteration of the VB 
algorithms is zero. Using the union bound, one can see that the probability of false verification in any 
iteration and also in the whole recovery algorithm is zero. 

IV. VB Recovery Algorithms as Message-Passing Algorithms 
A. Definitions and Setup 

There are a number of VB decoding algorithms that can be formulated as node-based message-passing 
(NB-MP) algorithms. These are the algorithms that are of interest to us in this paper. Each algorithm 
works in iterations through exchanging messages between the check nodes and the variable nodes along 
the edges in the graph. Any message sent from a variable node to its neighboring check nodes belongs to 
an alphabet set M. : {0, 1} x K. The first coordinate of such a message is a status flag, sometimes referred 
to as "recovery flag", taking binary values. The flag indicates the verification status of the variable node. 
If this flag is 0, then the variable node is not verified. If, on the other hand, the flag is 1, then the variable 
node has been verified. In this case, the second coordinate, which is a real number, is interpreted as the 
verified value of the variable node. 

Similarly, any message sent from a check node to all its neighboring variable nodes belongs to an alphabet 
set O : Z + x R. The first coordinate of such a message indicates the number of unverified variable nodes 
neighbor to the check node. The first coordinate is in fact the degree of the check node in the subgraph 
induced by the unverified variable nodes. The second coordinate indicates the current value of the check 
node, i.e., the result of the linear combination of the unverified neighboring variable nodes. 
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The edges, in NB-MP algorithms, do not simply forward messages from check nodes to variable nodes and 
vice versa. Instead, based on the traveling direction of the message, edges multiply or divide the second 
coordinate of the message by their associated weight. More specifically, if the message is sent from a 
variable node to a check node, its second coordinate is multiplied by the weight. The second coordinate 
of the message is divided by the weight, if the message is sent from a check node to a variable node. So, 
although messages generated by a node (either variable node or check node) are sent identically over all 
adjacent edges, the fact that the edges may have different weights will result in different messages being 
received at the destination nodes. All such messages are independent if the weights associated with the 
corresponding edges are independent. 

Any iteration £ > 1 in NB-VB algorithms, consists of two rounds, each with two half-rounds. In each 
round, every check node processes all received messages from the previous round together with its 
associated measurement and sends out a message from the alphabet O to all its neighboring variable 
nodes (first half-round). In the second half-round, each (unverified) variable node decides on its next 
state by processing all its received messages. Whichever the decision is, the variable node sends back 
a message, from the alphabet A4, to all its neighboring check nodes. So, a round starts with check 
nodes processing the received messages from neighboring variable nodes, proceeds with the transmission 
of messages from check nodes to variable nodes, continues by variable nodes processing the received 
messages from neighboring check nodes, and ends with the transmission of messages from variable nodes 
to check nodes. The two rounds in each iteration follow the same general structure. They only differ in 
the processing carried out in the variable nodes. 

Let $u : O dv — > M. and : O dv — > M., £ £ N, represent the mappings used at any unverified 

variable node to map the incoming messages to the outgoing message in the first and the second round 
of iteration £, respectively. Obviously, due to the verification-based nature of the algorithms, when a 
variable node becomes verified at an iteration, its outgoing message remains unchanged, irrespective of 
its incoming messages. In contrast to the variable nodes, the mapping function used in check nodes is 
identical for both the first and the second round of each iteration. Every check node Cj, i £ [m] has 
an associated received measurement a random variable taking values in R. So, we use the notation 
$ { c e) : R x M dc ->■ O, £ £ N, to denote the mapping function used in all check nodes at iteration £. For 
the sake of completeness, let <3>i°^ = <£>1 2 ' ^ : O dv — > Ai and : R — > O represent the mappings used, 
respectively in all variable nodes and check nodes at iteration 0. This iteration consists of only one round. 
For the VB algorithms under consideration, the mapping functions in the variable nodes and check nodes 
are not a function of the iteration number. Therefore, we omit the superscript (£) henceforth. 



In what follows, we describe VB algorithms of Section 



III 



as message-passing algorithms with the general 
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structure explained abovej^] 



B. Message-Passing Description of Recovery Algorithms 

To describe the four VB recovery algorithms using the message-passing approach, we need to define 
the mappings , and $ c . Mapping embeds the verification rules D1CN and ECN, while the 
mapping $1, embeds the ZCN rule. To make the description of mappings & v and $ c simpler, we introduce 
some notations to represent the incoming messages to variable and check nodes from the alphabet sets 
O and /A, respectively. A message o 6 O, incoming to a variable node, is an ordered pair of elements 
where d G Z ,f G R. A message m £ M., incoming to a check node, is an ordered pair of 
elements (s,co), where s £ {0, 1}, to £ K. Moreover, we assume that there is an arbitrary numbering for 
edges adjacent to a node (either variable node or check node). So, we use the notations Oi,i £ [d v ] and 
rrij,j G [c? c ], to denote the incoming messages to variable nodes and check nodes, respectively. 

At iteration zero, all variable nodes are unverified and there is no received message at the check nodes. 
At this stage, all check nodes send their corresponding measurements along with their degree (d c ) to their 
neighboring variable nodes. For the following iterations £ > 1, the mapping function at any check node 
Ci is as follows: 

d c d c 



$ c (cj, mi, • • • , m dc ) = (d c - Si, Cj — y~] s^), 



where, is the measurement associated with the check node q, and m,j = (s 4 , u^) is the message received 
along the ith edge. The mapping functions , $1 2 ' ) are algorithm dependent and are discussed for each 
VB algorithm separately next. 

The decoder stops at an iteration £, £ > 1, if the algorithm makes no further progress, i.e., the set of 
verified variable nodes are the same for the two consecutive iterations £ — 1 and £. Equivalently, the 
algorithm stops if the messages sent from variable nodes to check nodes, and also from check nodes to 
variable nodes, are the same for two consecutive iterations £ and £ — 1. At this point, if the decoder is 
able to verify all the variable nodes, then the decoding is called successful. Otherwise, the decoder will 
declare a failure. 

Genie 

In this algorithm, the decoder has the knowledge of the support set /C. So, the verification rules ZCN 
and ECN are not needed for this algorithm. Hence, each iteration in this algorithm consists of only one 
round, in which one verification rule (D1CN) is applied to all variable nodes. For variable nodes not in 



9 It is worth mentioning that the message-passing description of the NB-VB algorithms, presented in Section IV-B is only valid for the 



cases in which the nonzero weights of the sensing graph are drawn from an uncountable or countably infinite alphabet set. If the elements 
of the sensing matrix are drawn from a finite alphabet set, such as binary and 1, the outgoing messages from a check node should also 
include the list of all unverified variable nodes neighbor to the check node. The mapping function in the variable nodes should also change 
in order to use the extra information in the incoming messages. 
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the support set, the outgoing message in all iterations is fixed and equals m = (1,0). 

For any variable node in the support set, the mapping $ v (oi, • • • , o^) is defined based on the following 
rules. 

• If among all received messages (from the neighboring check node), there exists only one message, 
say Oj, i G [d c ], such that Oj = (1, G R, then $„(oi, • • • , o^) = (l,£j). In this case, the variable 
node is verified with the value 

. If multiple messages exist in the form (1,£) (any £ G R), then choose one at random, say Oj = 
(1, &),& G R, and set $„(oi, • • • , o^J = (l,£j). In this case, the variable node is verified with the 
value £j. 

• If none of the above happens, then $„(oi, • • • , o dv ) = (0, 0). In this case, the variable node is still 
unverified. 

LM 

For any unverified variable node in this algorithm, the mappings and $i 2 ^ are defined according to 
the following rules. 

. ^(oir- - ,o dv )- 

- If among all received messages (from the neighboring check nodes), there exists only one 
message, say o u i G [d c ], such that o { = G R, then $^\o ir -- ,o dv ) = In 
this case, the variable node is verified with the value 

- If multiple messages exist in the form (1,£) (any £ G R), then choose one at random, say 
Oj = (1, & G R, and set (oi, ■ ■ ■ , o^) = (1, ^). In this case, the variable node is verified 
with the value £j. 

- If none of the above happens, then $^^(01, • • • , o dv ) = (0, 0). In this case, the variable node is 
still unverified. 

. $l 2) ( 0l ,-- - ,o dv y. 

- If there exists at least one message o t such that Oi = (di, 0) (for any dj G Z + ), then 

$i, (01, • • • , o^) = (1, 0). In this case, the variable node is verified with the value equal to 0. 

- If no incoming message exists in the form (dj, 0) (for any di G Z + ), then 

(<?!,••• , o d J = (0, 0). In this case, the variable node is still unverified. 

sbbB 

For any unverified variable node in this algorithm, the mappings §^ and are defined by the following 
rules. 

I0 The original recovery algorithm introduced in |l8) has a computational complexity of 0(m ■ logm) (which translates to 0(n ■ logn) 
when using biregular graphs of fixed degrees). It is easy to prove that the message-passing description provided here does not change the 
recovery capability of the algorithm but results in the reduction of decoding complexity from 0(n ■ logn) to O(n). 
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. ^(oi,-- - ,o dv ): 

- If among all received messages (from the neighboring check nodes), there exists only one 
message, say o u i G [d c ], such that o { = e R > then ^(oi,--- ,o d J = In 
this case, the variable node is verified with the value £j. 

- If there exist N messages (2 < N < d v ) o h = (d H ,^ h ), o i2 = (d i2 ,£ i2 ), ■ ■ ■ , o lN = (d iN ,^ N ), 
such that = 6 2 = " ' = &n> then (01 , • • • , o<2„) = (1, I n this case, the variable node 
is verified with the common value £i t p] 

- If a variable node is verified to different values according to verification rules above, then choose 
one at random and generate the outgoing message accordingly. 

- If none of the above happens, then <3>^(oi, • • • ,o dv ) = (0, 0). In this case, the variable node is 
still unverified. 

. $l 2) (o!,-- - ,o dv y. 

- If there exists at least one message Oi such that Oi = (di, 0) (for any di G Z + ), then 

<H (oi, ■ ■ ■ , o dv ) = (1, 0). In this case, the variable node is verified with the value equal to 0. 

- If no incoming message exists in the form (di, 0) (for any di G Z + ), then 
<H (oi, ■ ■ ■ , o dv ) = (0, 0). In this case, the variable node is still unverified. 

XH 

For any unverified variable node in this algorithm, the mappings and $i 2 ^ are defined according to 
the following rules. 

. fc^foi,-- - ,o dv ): 

- If there exist M messages (\d v /2] < M < d v ) o h = {d^,^), o h = (d i2 ,^ 2 ), o iM = 
( d i M ,£i M ), such that & = & 2 = • ■ • = £ iu , then $\P(oi, ■■■ ,o dv ) = (1, ^J. In this case, the 
variable node is verified with the common value £i v 

- If a variable node is verified to different values according to the verification rule above, i.e., if 
two groups of messages both at least of size d v /2 satisfy the above condition, then choose one 
at random and generate the outgoing message accordingly. 

- If none of the above happens, then <3>^(oi, • • • ,o dv ) = (0, 0). In this case, the variable node is 
still unverified. 

. $l 2) (o!,-- - ,o dv ): 

- If there exists at least one message Oj such that Oj = (di, 0) (for any di G Z + ), then 

<&i, (oi, ■ ■ ■ , o dv ) = (1, 0). In this case, the variable node is verified with the value equal to 0. 

- If no incoming message exists in the form (d i} 0) (for any di G Z + ), then 

(2) 

§v' (oi, ■ ■ ■ , o dv ) = (0, 0). In this case, the variable node is still unverified. 

11 We note that the message received by the variable node equals the message sent from the check node divided by the weight of the 
connecting edge. Therefore, receiving two messages with the same value would imply that, almost surely, the unverified variable node under 
consideration is the unique nonzero variable node neighbor to the iV check nodes. Other unverified variable nodes neighbor to these iV 
check nodes do not belong to the support set and should be verified with a value equal to zero. This, however, happens in the next round. 
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C. A Short Note on False Verification 

In the above description of recovery algorithms, there may be cases where a variable node can be verified 



to different values by different rules. Using the same assumption made in Section III-B , it is easy to see 
that the probability of this event is equal to zero. In such cases, we have thus assumed that the variable 
node is verified by one of the rules selected randomly. Clearly, the probability of false verification as a 
result of such selection is zero. 



V. Asymptotic Analysis Framework 

In this section, we first show that (i) the performance of a realization of the sensing graph, with a certain 
selection of the edge weights for the recovery of a realization of the input signal concentrates around the 
average performance of the ensemble (where the average is taken over all the elements in the ensemble 
Gf(d v , d c ) x Vg(a), for given probability distribution functions /, g, and given constant parameters d v , d c 
and a), as n tends to infinity, and (ii) the average performance of the ensemble, as n goes to infinity, 
converges to the performance of the cycle-free case defined as follow sf^j 

Let Ml 1 be the neighborhood of node v of depth 2£, i.e., the subgraph consisting of the variable node v 
and all those nodes that are connected to v with any path of length less than or equal to 2£. We say that 
we are working under the cycle-free assumption when for a fixed £, and for every v, is tree-like. 



Similar to [35 1, the concentration result presented here is valid for any iteration number i. We thus often 



omit the superscript £ in the notations. 

Following the concentration results, we present the asymptotic analysis of the Genie algorithm. The 
analysis provides us with the average ensemble performance for the asymptotic case of n — > oo. We then 
generalize the concepts used in the analysis of Genie and analyze LM and SBB algorithms. The analysis 
of XH is similar and is omitted to prevent redundancy. However, we shall report the thresholds of this 



decoder for different biregular graphs in section VII 



A. Concentration Results and Convergence to Cycle-Free Case 

Consider a weighted graph selected at random from Gf(d v ,d c ). Such a graph can be represented by its 
corresponding unweighted graph G (from the ensemble G n (d v , d c )) along with its vector of edge weights 
w (the vector of elements, with an arbitrary fixed order, in an m x n matrix W from the ensemble WJ lxn ). 
Also consider an input signal vector v chosen randomly from V"(a). Suppose that a VB algorithm is 
applied to the measurement vector c = Gv to recover v iteratively, where G is the biadjacency matrix 

I2 Our method of proof is very similar to that of |35) , though due to the differences in the nature of the problems (channel coding vs. 
compressed sensing) and the difference in the update rules at the graph nodes, some arguments are revised and some new components are 
added to the proof. 
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of the chosen weighted graph. For this scenario, let (3^ (:= (3^(G,w,v)) be the fraction of unverified 
nonzero variable nodes at the beginning of iteration £, i.e., the fraction of variable to check node messages 
passed along the edges of the chosen weighted graph with unverified status (sent by nonzero variable 
nodes); further, let E[/3^] denote the expected value of (3^\ where the expectation is taken over the 
ensembles Qj(d v ,d c ) and V"(a). Now, consider the corresponding cycle-free case, and let be the 
expected number of messages with unverified status passed along an edge emanating from a nonzero 
variable node with a tree-like neighborhood of depth at least 2£ at the £th iteration. Here, again, the 
expectation is taken over the ensembles W™ xn and V"(a). 

In the subsequent subsection, we will show how a" can be calculated. It should be clear that oS®, being 
defined as the "average" over the ensemble of weights and input vectors, is the same as the "probability" 
that a message from a non-zero variable node with a tree-like neighborhood of depth at least 21, at the 
£th iteration, carries an unverified status. In this section, we use the interpretation of as an average. 
The interpretation of cv® as a probability will be used in the analysis section. In the following, we will 
show that over all realizations, with high probability, (3^ does not deviate much from E[/3^], and E[/3^], 
itself, is not far from aS l \ as n tends to infinity. 

Theorem 2. Over the probability space of all weighted graphs Gf(d v ,d c ), and all signal inputs V"(a), 
for a fixed £, letting (3^ and a" be defined as above, there exist positive constants fj l (d v ,d c ,£) and 
j(d v , d c , £), such that (i) for any e > 0, 

Pr [\(3^ - E[/?W]| > e/2] < 2e~ e%n ^, (1) 

and (ii) for any e > 0, and n > 2^/e, 

| E [/?W] - a W| < e/2. (2) 

Note that combining ([T]) and ([2]), the following holds: for any e > 0, and n > 2 r y/e, 

Pr[|/3W- a W| > e ] <2e^ 2n ^. 

Hereafter, for ease of notation, we drop the superscript £, as we are studying the parameters of interest 
at a fixed iteration £. 

Proof: We start by proving ([T]). Let the triple T := (G, w, v) represent one particular realization G, 
w, and v of the ensembles of graphs, edge weights and input vectors, respectively. For any i, < i < 
(2d v + l)n := £, the ith element of T, with an arbitrary order which is fixed during the discussion, is 
referred to as the ith coordinate of T. Consider two arbitrary realizations T' and T" . Let the symbol "=i," 
for all < i < £, be a binary relation as follows: X" =j T" implies T" =i-\ T", where T' =j T" if the 



19 



first i coordinates of the triples T" and T" are the samepj Suppose that we first expose the nd v edges of 
the graph one at a time. In the next nd v steps, we proceed with exposing the nd v edge weights one by 
one, and finally, we expose the n variable nodes' input values one at a time. Then, T =$ T" if and only 
if the information that we reveal in the first i steps of exposure for both triples is the same. 

We construct a martingale sequence /3 , Pi, /% by defining 

f3 l (T) = E[f3(T')\T' =i T}. 

(Po is a constant, i.e., the expected value of /3(T') over all graphs G', all edge weights w', and all input 
vectors v'; 0$ is f3(T) itself.) By the application of Azuma's inequality (see [36, Chapter 7]), one can 
give an upper bound on 

Pr [\p(T) - E [/3(T')]| > e/2] = Pr [|& - /3 | > e/2] , 

so long as for all < i < £, \Pi+i — P%\ < Aj, for some bounded Aj, as n tends to infinity. 

In the following, we find Aj, for all < i < £, for the Genie algorithm, and by following similar steps, 
one can find Aj for the other VB algorithms. We will explain the details of the proof for Genie, and the 
proofs for the other algorithms will not be presented as the method of the proofs is quite similar. 

First, consider the steps where we expose the edges of the graph, i.e., for all < i < nd v . We want to 
upper bound 

|A +1 (T)-A(T)|. 

Let G(G, i) be the set of graphs in the ensemble G n (d v , d c ), such that their first i edges are the same as the 
edges in G, and let Qj(G, i) be the subset of Q(G, i) whose (i + l)th edge from the perspective of variable 
nodes is the jth edge from the perspective of check nodes. It should be clear that G(G, i) = Uj e j z Qj(G, i), 
where Jj is the set of indices of those edges from the perspective of check nodes that have not been exposed 
before revealing the (i + l)th edge from the perspective of variable nodes. Thus, by definition, 

P*(T) = E[/3(T')|G' eG(G,i)] 

= $>[/3(T')|G'G^(G,z)] 

•Pr [G' E Qj(G,i)\G' e Q{G,i)\ , (3) 

and 

A +1 (T) = E[/3(T / )|G / e^(G, i )], 

13 An example: If, for i — nd v , we have T' =i T" , then the two realizations T' and T" have the same graph, but may have different 
edge weights and different input vectors. 
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for some j G J*. Since J2jeJi P r [G" e Gj(G,i)\G' G Q{G,i)\ = 1, by using ([3]), one can show that 



A CO > nunE[/3(T')|G / G^(G,i)] 
= E[^(T')|6?'eg fc (G,i)], 



(4) 



for some k G Jj. Therefore, 



|A+i(T)-A(T)| 

< maxlE^TOIG'G^-CG^)]-/?^)! 

< max |E [(3{T')\G' G Qj{G,i)\ - E [/3(T')|G" G G k (G,i)\\ . 

Then, for all 1 <j,k < md c , we need to bound the right hand side of the above inequality. 



(5) 
(6) 



Let <f>j t k : Qj(G,i) — > Gk{G,i) be a map such that for any given graph iJ G Gj(G,i), the graph ii 7 : = 
4>j t k(H) is the same as if, except in one pair of edges, i.e., if the jth and the A;th edges in H from the 
perspective of check nodes are the (i + l)th and the i'th edges from the perspective of variable nodes, 
respectively, then, these two edges will be the i'th and the (i + l)th edges in H' from the perspective 
of variable nodes, respectively. By construction, (f>j k is a bijection, and it preserves probabilities. Thus, 
E[p(G',w',v')\G' G G k (G,i)} = E[P{(j)^ k {G'),w',v')\G' G Gj(G,i)\. Further, since the graphs H and H' 
defined as above are different only in one pair of edges, we will show later that for each VB algorithm 
of interest, regardless of the choices of w and v, the difference between f3(H,w,v) and f3(H' ,w,v) is 
bounded from above. To be precise, in the case of the Genie algorithm, for all edge weights w, and for 
all inputs v, we can write 

\P{H,w,v)-P(<l> j)k (H),w,v)\<dt/n. (7) 

The proof of inequality @, and similar results for the cases of LM and SBB algorithms are given in 
Appendix IV-B| 14 By ([7J), any pair /3(H,w,v) and f3((fij^(H),w,v) has a bounded difference and thus, 
for any j,k G J,;, 

|E[/3(T')|G" G GAG,i)} -E[f3(T')\G' G Q k (G,i)]\ 
= E[(3'\G' G Gj(G,i)] - E[^JG' G ^(G,i)] 

= Etf'-^JG'eGjiGJ)] 

<n\P , -%J\G , eg j (G,i)] 
< max \p(G',w',v') - {}((/) jk (G'),w',v')\ 

G'.w'v' 



< di/n, 



(8) 



l4 Arguments similar to what was used in upper bounding the difference between /3(T) and j3(T'), when G and G' are different in one 
and only one pair of edges, will be used to give an upper bound on fi(T) and /3(T'), when (G, w) = (G' , w'), but v and v' are different 
for one and only one variable node. Such a bound will be used later on in the proof. 
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where /?' := 0(G',w',v'), and 



f3((f)j tk (G'),w',v'). Thus, combining (|6]) and ©, for all < i < 



nd v , one can take Aj = d^/n. 

In the following, we shall find Aj for all nd v < i < 2nd v , or all 2nd v < i < £, i.e., when we reveal the 
edge weights or the input values one at a time, respectively. 

Since all the edge weights are nonzero and drawn independently from a continuous alphabet, one can see 
that for all nd v < i < 2nd v , and for all realizations T, /3j(T) = (5 i+ i(T). We can, therefore, take Aj = 0. 

The method of upper bounding the difference between /3j(T) and (3 i+ i(T), for all 2nd v < i < £, is 
different than what was used earlier. By definition, 0i(T) is the expected value of the random variables 
/3(T'), for all realizations V, such that T =j T. For a given T, consider the value of A(T). In the 
corresponding expectation, the value of (i + l)th coordinate of T is free (not fixed). It should be clear that 
(3i(G, W, R) cannot be less than the expected value of (3(T'), where T' =, T, and the [i + l)th coordinate 
of T" is zero. Now, consider the value of (3 i+ i(T). This quantity cannot be larger than the expected value 
of f3(T"), where T" =j T, and the (i + l)th coordinate of T" has a nonzero value. For a given T, for 
any realization T' in the ensemble over which we take the average in order to calculate /3j(T), there are 
counterpart realizations T" in the ensemble over which we take the average with regards to (3 i+ i (T) with 
the following properties: (i) (G",w") = (G',w'), and v" and v' are different only in the value of the 
[i + l)th coordinate of the underlying realizations T or T", and (ii) the number of edges emanating from 
nonzero variable nodes which carry unverified status messages in T" is not larger than that in T" . Let 
T(i + 1), T'(i + 1) and T"(i + 1) represent the (i + l)th coordinate of the realizations T, T and T", 
respectively. From the above argument, one can conclude that (i) if T(i + 1) = 0, then 



0i+i(T) < Pi{T), 



(9) 



and (ii) if T(i + 1) ^ 0, then 



0i(T) < ft+i(T). 



(10) 



Furthermore, by definition, it should be clear that (i) if T{i + 1) = 0, 



A(T) < E[/3(T")|T // =j T, T"{i + 1) ^ 0], 



(11) 



and 



A +1 (T) = E[/3(T')|T' =j T,T'(i + 1) = 0] 



(12) 



and (ii) if T(i + 1) ^ 0, 



Pi(T) > E[P(T')\T' =j T,T'(t + 1) = 0], 



(13) 



and 

A+i(T) < E[/9(T")|T" =j T,T"(i + 1) ^ 0]. 



(14) 
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Thus, combining (|9]), ( [TT] ) and ( fT2| ) in the case of T(i + 1) = 0, or combining ( fTO] ), ( fT3| ) and ( p~4| ) in the 

case of T(i + 1) ^ 0, one can see that 

|A(T)-/3 m (T)| 

< |E[/3(T')|T' =i T,T'(i + 1) = 0] - E[/3(T")|T" =< T, T"(i + 1) ^ 0]| 



<™f^( T ')-/3(ni 



(15) 



where the maximization is over all the realizations T and T", such that T =, T, T" =j T, and T" and 
T" are the same except in their (i + l)th coordinate which is zero or nonzero in T' or T", respectively. 



By using the bounds for Genie algorithm in Appendix IV-B for all possible realizations T' and T" which 
satisfy the above conditions, one can write 



\/3(T') - {3(T")\ < dl/n. 
By combining ( fT5| ) and ([16]), for all 2nd v < i < £, we can have Aj = d^/n. 
Now, applying Azuma's inequality, i.e., for any A > 0, 



(16) 



Pr[|/3-E[/3]| > A] < 2e 2 ^°<^ A t , 
followed by setting A = e/2, for any e > 0, we obtain 

Pr[|/3-E[/3]| > e/2] < 2e~ e2n ^, 

where /i = 8(d v + 

Similarly, for each of the LM and SBB algorithms, by replacing the value of Aj, for each < i < 
according to the bounds given in Appendix IV-B , one can prove inequality ([T]), for some constant \i which 
depends on the recovery algorithm and the parameters £, d v , and d c . 

To complete the proof of Theorem [2| we now prove inequality (Q. Let E[/3j], for all 1 < i < nd v , be the 
expected number of variable to check node messages with unverified status passed along the ith edge, 
denoted by e i? connected to a nonzero variable node vfe). Note that the expectation is over all graphs, 
all signal inputs and all edge weights. Then, by linearity of expectation, from the definition, it follows 
that 



Ef 



E[A]/n^ 

i£[nd„] 

E[/3i], 



where we have used the fact that due to the regularity of the graph and having i.i.d. edge weights and 
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input vector's elements, E[/3j] = E[/?i], for all i 6 [nd v ]. Clearly, 

E[ft] = E[/3i|jV^ ei) is tree-like] ■ Pr[Af^ ei) is tree-like] 

+E[/3!|AC 2 f ei) is not tree-like] • Pr[A/" 2 ( £ ei) is not tree-like]. 



In Appendix IV-A the probability that, for a variable node v, Af^ is not tree-like is upper bounded by 



7/n, where 7 is a constant with respect to n, i.e., 

PrUV-f is not tree-like] < -. 

n 

Further, by definition, E^il^V^fas tree-like] = and obviously, |E[/3i|jV 2 / ei as not tree-like] | < 1. 
Therefore 

- 7 /n) < E[(3] < a (e) + 7/n, 

which results in 

|E[/3] -a {i) \ < 7 /n. 
Lastly, for any e > 0, by taking n > 27/e, we get 

|E[/3] <e/2, 

which completes the proof. ■ 
B. Analysis of the Genie 

In the Genie algorithm, the support set is known. Therefore, the set of all variable nodes can be divided 
into two disjoint sets: verified 1Z and unverified /C. At iteration zero, variable nodes in the support set are 
unverified, and the zero-valued variable nodes (variable nodes not in the support set) belong to the verified 
set (with a verified value equal to zero). In future iterations, during the verification process, variable nodes 
are removed from the set /C and added to the set 1Z. We use the notation /C^ and to denote the set 
of unverified and verified variable nodes at (the beginning of) iteration £, respectively. We also use the 
superscript i to indicate the iteration number for all the other sets in this section in the same way. 

Each iteration of the Genie algorithm consists of only one round (two half-rounds). At a generic iteration 
i, in the first half-round, check nodes process the received messages from variable nodes sent at iteration 
t — 1 and generate outgoing messages to be delivered to variable nodes. As we shall see, check nodes 
are grouped based on their degree also reflected in their outgoing messages. A check node with degree 
j before half-round 1 may have degree i < j after half-round 1. The set of check nodes with degree i 
after the (processing in the) first half-round of iteration I, is represented with Mf . In the second half- 
round, the variable nodes process the incoming messages and generate outgoing messages accordingly. 
Variable nodes, are also grouped based on the number of neighboring check nodes of degree 1. The group 
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of unverified variable nodes with % neighboring check nodes of degree 1 after the (processing in the) 

(£ 2) 

second half-round of iteration £ is represented with /Q . Note that the grouping of check nodes remains 
unchanged during the second half-round of the same iteration. In a similar way, the grouping of variable 
nodes remains unchanged during the first half-round of the next iteration. 

In the first half-round of iteration zero, every check node sends its corresponding measurement value along 
with its degree, d c . In the second half-round, variable nodes in return a verified message with a value 
equal to 0, while variable nodes in K,^ return a message with the status bit indicating their unverified 
status. At iteration 0, the set /Cq ' 2 - 1 includes all unverified variable nodes K,^\ Therefore, at this stage, 
no additional variable node can be verified because all incoming messages to unverified variable nodes 
have d c in their first coordinate. 

In the first half-round of iteration 1, received messages from variable nodes are processed at check 
nodes and outgoing messages to variable nodes are generated. Each such message has the following two 
properties: 1) the second coordinate of the message is the same as the second coordinate of the message 
sent over the same edge in the same half-round of iteration 0, and 2) the first coordinate of the message 
is at most the same as the first coordinate of the message sent over the same edge in the same half-round 
of iteration 0. The second coordinates are the same because no variable node from the support set was 
verified at iteration 0. The first coordinates may be different because the variable nodes not in the support 
set (JZ^) have been revealed thus reducing the number of unverified variable nodes connected to the 
check nodes. We use the notation J\f^ to refer to the set of check nodes Af- ' 1 ^ that are categorized in 
Nj 1,1 ^. The arrow points downward to emphasize that j < i. Note that for iteration 1, % — d c . 

In the second half-round of iteration 1 and after receiving the messages from check nodes, variable nodes 
in /Cq ' 2 ^ should be regrouped in K?f ,2 \ < j < d v . We denote by /C-jj the set of variable nodes in Kf^ 

(1 2) 

joining the set /Q ' . In this case, j > i, hence the use of the arrow pointing up. Based on the verification 
rule for the Genie, at any iteration £ if an unverified variable node is neighbor to at least one check node 
in the set it will be verified. So, variable nodes in the set (Jj=i K-f MQ verified by the end of 

iteration 1. Therefore, the new sets and IC^ to be used at iteration 2 are calculated as follows. 

= U {/C?' 2) , K*F\ • • • , *£' 2) } , K (2) = Kg*. 

The process of sending, receiving, and processing messages between check nodes and variable nodes as 
well as the verification process continues in next iterations in the same fashion as we discussed above. In 
summary, in a generic iteration I, we have the following relationships: 

*?-'■"= E<, <=».!.• -.4. <i=°. 

3=0 

Af'^XX*. i = 0,l,-,4. 

i=j 
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£0-1,2) _ \p ^,2) _ £(<>) „• _ o I . . . J 

By tracking the set 10® with iterations, we can decide on the success or failure of the algorithm. If the 
size of the set KS® shrinks to zero as t — > oo, then the algorithm is successful. On the other hand, if 
there exists an e > such that |/C^| > e, W > 1, then the algorithm is not successful. The success or 
failure of an algorithm depends on the parameters of the graph (d v and d c ) as well as the initial size of 
the support set \JC^\. 

To be able to use the concentration results discussed before and analyze the Genie in the asymptotic case, 
we track the probability a" that a variable node belongs to the set /C^. Hence, we focus on a tree-like 
graph with random weights and a random input signal (random support set and random nonzero values). 
Let pff X \ < i < d c , denote the probability that a check node belongs to the set Furthermore, 
let P^' 2 \o < j < d v , denote the probability that an unverified (nonzero) variable node belongs to the set 
10f . In what follows, we show the step-by-step procedure to find the update rules for probabilities P^\ 
P^' 2 , and a^ e+1 \ in terms of probabilities a^ 1 ^, af®, P^ l,l \ and P%~ ■ The formulas are calculated 
using probabilistic/counting arguments. The details can be found in Appendix |IJ 

1) Find the set of probabilities V^ l \i = 0, • • • ,d c from: 



where, 



P% ii =( j t i )(A®) i - i (l-AV) i , 2<j<d c , 0<i<j, 

D (/-l,l) 

= 1 - (1 - p^) dv ~\ = 



V-Vd r 



a 



(l 2) 

2) Find the set of probabilities P K \ from: 



v m = fd v \ ^ p{e+1) y ^ _ p{l+l) y,-3 ^ < j < d v . 



J 

r. 

K-i 



3) Based on the set of probabilities P^ 2 \ find the probability a^ +1 ^ from: 



1=1 
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TABLE II 

Sets that change in each half-round of each round at any iteration 



Rl 


R2 


HR1 


HR2 


HR1 


HR2 




/Q — > )Cj 




A, -> Aj- 



The initial set of probabilities P^ for Genie are as follows: 

p & 1)= (t) (« (0) Hi-« (0) ) dc ~ l , o<z<4- 

The set of probabilities Pj^' 2 ' 1 are calculated following step (2) by noticing that = a^°K 
C. Notation and Setup for the Analysis of LM and SBB 

In LM and SBB, at the beginning of any iteration £, the set of all variable nodes can be divided into three 
disjoint sets: K,^ , 1Z™\ and A^. The set fC^ consists of all unverified nonzero variable nodes, while 
the set A( £ ) consists of all unverified zero- valued variable nodes. The set includes all variable nodes 
recovered up to iteration £. Clearly, the decoder can not make the distinction between variable nodes in 
the sets and A^. The distinction between the two sets, however, is very useful in the analysis. 

Furthermore, at any iteration £, we partition the set of all check nodes into subsets N i * . The index i 
indicates the number of neighboring variable nodes in the set KS^ while the index j indicates the number 
of neighboring variable nodes in the set A^. Note that: 1) the degree of each check node in the subgraph 
induced by unverified variable nodes at iteration £ (reflected in the outgoing message of the check node), 
is % + j, and 2) the second coordinate of messages received by a variable node in the support set from a 
subset of check nodes all in the sets jVy , < j < d c — 1, is the same. 

In algorithms LM and SBB, each iteration consists of two rounds, each with two half-rounds. The 
configuration of the sets at the end of each half-round (1 or 2), each round (Rl or R2), and each iteration 
(£), is specified using the following 4 superscripts: (£, Rl, 1), (£, Rl, 2), (£, R2, 1), and (£,R2,2). In the 
first half-rounds (any round and any iteration), messages are passed from check nodes to variable nodes, 
while in the second half-rounds, messages are passed from variable nodes to check nodes. Also, with the 

(1 l) (2 I) 

definition of mapping functions $i, ' and ${, , the set of verified variable nodes in the first and the 
second rounds belong to the sets K,^ and A^, respectively. We have summarized in Table [n] the sets 
that are affected in each half-round (HR) of each round (R) at any iteration. 

In the Genie algorithm, the set K,\ f represents the set of unverified variable nodes in the support set that 
have i neighboring check nodes of degree 1. The definition of this set in the LM and SBB algorithms is 
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different, as explained in the following. The set /Q in the LM algorithm represents the set of unverified 
variable nodes in the support set with i neighboring check nodes in the set A^g. To define this set in 
the SBB algorithm, let A/? } := [jfS^ J\f^- R1,l) . With this notation, the set ICf in the SBB algorithm is 
defined as the set of unverified variable nodes in the support set with i neighboring check nodes in the 
set A/?. 

In order to track the evolution of the unverified support set, it is imperative to characterize the set of 
variable nodes in the support set that are recovered in each iteration. In Theorems [3] and [4] below, we 
characterize the verification of unverified nonzero variable nodes in the set K,^' in each iteration £ for the 
two algorithms LM and SBB. The theorems are proved in Appendices |n] and III 



Theorem 3. In the first round of any iteration £ in the LM algorithm, a nonzero variable node v E 
is verified if and only if it belongs to the set Uili )C\ e ' R1 ' 2 \ 

Theorem 4. In the first round of any iteration £ in the SBB algorithm, a nonzero variable node v E 

is verified if and only if it belongs to the set {jfl 2 K.f' R1 ' 2 ^ U K.f' R , where the set k!(' R1 ' 2 ^ consists of 

all variable nodes in the set fc^> R1 > 2 ') connected to the set Af^ 111 ' 1 ^. 



In the LM and SBB algorithms, unverified variable nodes with zero values are verified in R2. Note that 
a check node is zero-valued if it belongs to the set A/qj R2 , < j < d c . Therefore, for the verification 
of zero-valued variable nodes in the second round of iteration i, we divide the set of variable nodes in 
A^ into subsets A! , < i < d v with the following definition: a variable node in the set Ap has i 
neighboring check nodes in the set |ljj=i -Mi l f 2 Uji A^/^}, i- e -> me set of all check nodes which 
became zero- valued after HR1 of R2. In Theorem [5] below, we characterize the verification of unverified 
zero-valued variable nodes in the set A^ at HR2-R2 in each iteration £ of LM and SBB algorithms. 

Theorem 5. In the second half-round of the second round of any iteration £ in the LM and SBB algorithms 
a zero-valued variable node v E A^ is verified if and only if it belongs to the set Uf=i ■ 

We denote by Af^ R f the set of check nodes that are moved from Afj;\ 1,R2 '^ to A/^ m '^ in HR1-R1 
of iteration i. Similarly, the set of check nodes that are moved from A^^'^ 1 ' 1 to M^ R2,1 ^ in HR1-R2 of 
iteration I are denoted by M\^ R2 \ Since variable nodes in K, and A are verified through iterations, we 
always have j < i and hence the use of notation i i j. Moreover, we denote the set of variable nodes 
that are moved from K,f~ l,Rl ' 2) to JCf' Rl ' 2) and from Af~ hR2 > 2) to A { -' R2 ' 2) in HR2-R1 and HR2-R2 of 
iteration £ by /C-^ m ^ and A^.' R2 \ respectively. The notation i •f j implies that j > i. 

The sets that fully describe the state of the decoder at the beginning of iteration £ are: K."', TZ^\ A^, 
A^j l,R2 ' x \ fct 1 -- 1 ' 111 ' 2 ^ anc j ^(^- 1 . iJ 2,2) p or ^ ana iy S j s ^ we track the probability that a node (variable 

node or check node) belongs to a certain set at each half-round, round, or iteration. We use the notation 
to denote the probability that a variable node belongs to the set K^K For the rest of the sets, we use 
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the standard notation of probabilities that was applied in the analysis of the Genie algorithm. For instance, 
we denote the probability that a check node belongs to the set H^- R by pj^ 1 ' 1 ^. 

In the analysis, the goal is to find the recursive equations that relate these probabilities for consecutive 
iterations. As we shall see, the analysis of the decoding process for LM and SBB results in a system of 
coupled recursive update rules. Moreover, we show that the update rules at iteration I are functions of 
probabilities at iteration i — 1. Hence, the complexity of the analysis scales linearly with the number of 
iterations. In the following two sections, we present the update rules for the LM and SBB algorithms. 
The derivation of formulas are discussed in detail in Appendices In] and III 



D. Update Rules for the LM Algorithm 

In one iteration £ > 1 of the analysis of the LM algorithm, the following update rules are applied: 
1) Find the set of probabilities P^* 1 , 1 < i < d c , < j < d c — i, from: 



d c -i 

,Rl,l) _ n {£-l,R2,l) v {l,Rl) 
k=j 



n (£,m,i) _ sr^ n (e-i,R2,i) ( 



where, 



d c -i p {e-i,R2,i) 

i=0 j=l 



and P { 2 is given in ([19]). 
2) Find the set of probabilities P K [ ' < % < d v , from: 



where, 



p(/,ia,i) 



3) Find the updated probability that a variable node is in the support set and is not verified from: 



=i 
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(£ R2 1) 

4) Find the set of probabilities PyJ, fc , < j < d c , < k < d c — i, from: 



(£,_R2,i) = (e,m,i) (e,R2) 
i=j 



where, 



Note that: 



n (e,Ri,2) 



-/v 14.0,0 ' 14,1,0 

5) Find the set of probabilities P K A [ , < i < d v , from: 



where is calculated according to (fTSl) 



6) And lastly, find the probability that a variable node is zero-valued and unverified at iteration I, from: 

p (£+l) _ p (t) p (e,R2,aHR2) ^ 

We have the following initial probabilities, by letting a := denote the initial density factor. 



(1 - a) (l - (1 - a) dc_1 ) , £><°> = (1 - a) dc_1 =4> A« = (1 - a) (l - (1 - a)^ -1 )' 



Since all check nodes have degree d c in the subgraph induced by the unverified variable nodes at iteration 
zero, we have: 

n (0,R2,l) l d c \ , s.% n ^d c -i ■ n , 

P\ . , . = . (at) 1 - a) , z = 0, • - • , d r . 



and thus 

(i,m,i) _ p (o,R2,i) (i,ri) _ i . . . a n - n . . . // _ 

where, 

^ = (*; ! ) ( Am Y (> - ^ w )*^ • i=o,-,4-<. 

Since no element of the support set is verified at iteration zero, we also have: 

a M = a (o) = a . 

£. Update Rules for the SBB Algorithm 

In one generic iteration £, £ > 2, of the SBB algorithm, the following update rules are applied in the 
analysis: 
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1) Find the set of probabilities P^' 1,+ \ pftff' 1 '^ , and pftff^, 2 < % < d c , < j < d c - i, from: 



d c -l 



n (£,Rl,l,+) _ n (e-i,R2,i,+) n (e,m) 



k=j 
dc-l 



v (£,ri,i,c) __ sr^ v (e-i,R2,i,c) v (e,Ri) 



fc=3 

d c —i 



V (£,R1,1) _ ST^ v (£-l,R2,l) v (i,Rl) 



k=j 

where, Pff R ^., and parameters and D^~^ therein, are obtained from equations p7] ) and ( fT8| ). 
2) Find the set of probabilities pgf 1 ^, pj£ m ' 2 ' + ), p^ m > 2 > c \ and pg m - 2 ), 1 < j < <f„, from: 

p(e,Rl,2) _ 1 „(^-l,-Rl,2)^(l,.Rl) 



/Co " A]-(£,Ri) /Co /Cofo ' 

1 



n (e,Ri,2,c) _ 1 „(i-i,m,2)„(i,fli) /-, _ f {e,Ri,o^ 



„(£,ia,2) _ „(<?,m,2,+) (^,ri,2,c) 



where, 



J£,R1,1,+) 



d c — 1 <i c d c — i 

J£,R1,1) 



3=0 i=2 3=0 



and, 



p(4Ri,i,+) p(^,i?i,i,c) 

_ M,o f{£,Rl,C) _ M,o 



d c -l 

3=0 3=0 



and, 

yy(W) = p (<-l,Hl,2) p (£,ill) , p(£-l,Rl,2) (£,m) /, _ r(£,m,+)\ , p (^-l,fll,2) /-, _ f(<^l,C)\ 

3) The probability that a variable node belongs to the support set and remains unverified after iteration 
I, is calculated as follows: 

a {l+X) =a {l) N {l,Rl)_ 

4) Find the set of probabilities P^ 2,1) , P%™ ,1,+ \ P^ xc) , and P^ 2,1) , 2 < fc < d c , < j < d c - k, 



from: 



P 



e,R2,l) _ n (l,Rl,l) . n (l,R2,C,0) . T) (^,i?2,+,0) 



P 



No,: 



(e,R2,i,+) 



+ p 



+ X>. 



(l,Rl,l) v (e,R2) 



i=2 



i=2 



P 



(t,R2,l,C) _ p(t,R2,C,F) _j_ n (t,R2,+,F) 



+ P 



A/i, 



P 



(f,-R2,l) 



i=k 



where, 



V 



t,R2,+,F) _ p(t,Rl,l,+) 



p(^-i,m,2) (i.Jii) 

/Co A^oti 



P 



,i?2,+,0) 



P 



d v 

E 

i=2 



(£-l,_Rl,2)„(£,Rl) 



/c 



it; 



-P 



P 



(i,R2,C,F) 



P 



(£,m) 



ifi 



(£,_R1) 



it.; 



P 



,i?,2,C*,0) 



n {t,Rl,l,0) „(£,K2,C,F) 



d c — 1 d c — l 

(t,Rl,l,+) „(£,R1,1,C) 



(^,R2) 



(cw) fc (i-cw) 



i— k 



E (e,Rl,l,+) (£,. 



a: 



+ 



„(^-i,m,2) p(A-Ri) 

1 _ p(t,Rl) 
p (e-l,Rl,2) (t,Rl) ( dy ~ 1 



+ 



1 _ p&Rl) 

n (e-i,Ri,2) n (t,Ri) ( d v - I 



^ _ f{l,Ri,C)\ _ 



1 _ P^-Rl) 

5) Find the set of probabilities P A ^ , < % < d v , from: 



P 



,R2,2) _ ( a v 



d v —i 



where D« is calculated according to ( |T8| ). 
6) Find the probability that a variable node is zero-valued and remains unverified in iteration £ + 1 
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follows: 



„(<+!) _ „W„(^R2,2) 
— ^A 



The initial conditions in the SBB algorithm are for iterations and 1, and are given below. The update 
rules for iteration 1 serve as the initial conditions for iteration 2. 



. HR1-R1: 



where, 



= ( 4 ~ ^ (AW)* (1 - AW)*-* - ' , A (1) = (1 - (l - (1 - a<°))* s " 1 )*'" 1 



HR2-R1 (Regrouping variable nodes after the recovery): 

(i,m,2) _ 1 a _ 

- /\r(i,m) I 1 ^ J • 



where, 



and, 



p a,fli,2) = ^ 2<i<d v , 



d c -i p(i,-Ri,i) p (i,«i>i) 
= V f( 1 ' fll ) = M| ° 

J'=0 r,(l,fll,l) 



HR2-R1 (Recovering variable nodes based on ECN): 

a (2) = a (o) N (i,fli) ^(W) +p g,m,2)^ ^ 



. HR1-R2: 



(i,i?2,i) _ (i,m,i) p (i,-R2) U-9 ■■■ H i -0 ■■■ H - i 



i=k 
d c 



„(l,iZ2,l,+) _ „(l,m,l)„(l,-R2) 



i=2 

n (l,R2,l,C) _ „(l,iil,l)„(l,fl2) 



where, 
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V 



(1,R2) _ 1 0(1,^2) _ 



P 



P 




< j < d c - 1, 



0<j<d c -i 



C« = (i _ s(D)^- 1 + _ i) ^(i) (i _ B ^) dv ~ 2 (1 - f^) . 



HR2-R2: 



P% = (1 - a<«) (l - (1 - a^ 1 )* (1 - 



where, is obtained by ( fT8| ). 



VI. Noisy Measurements 



We adopt the following model for the case where the measurements are noisy p7|: 



c = Gv + n. 



In this new model, v and G are the original signal and the sensing matrix, respectively. The new term, n, 
represents the noise vector added to the noiseless measurements Gv, resulting in the noisy measurement 
vector c. Elements of the noise vector are assumed to be i.i.d. Gaussian random variables with mean 
and variance a 2 . The addition of noise to the measurements results in the following two probabilities to 
be zero: 1) the probability of having a zero measurement, and 2) the probability of having two equal 
measurements. This will disable the ZCN and ECN rules in recovering the signal elements. Without the 
ZCN and ECN rules, zero-valued variable nodes are not verified, and consequently, no check node will 
have a reduced degree in the subgraph induced by unverified variable nodes. Therefore, the D1CN rule 
will also be ineffective. 

In the context of message-passing algorithms, there are generally two approaches to deal with noisy 
measurements. In the first approach, the original formulation of the problem is changed so that the noise 
is taken into consideration (25J, [ |26| . In the other approach, the algorithms are changed in order to cope 
with the presence of noise in the measurements [[3j. Although the first approach may result in lower 
reconstruction noise, it requires unbounded message size and is susceptible to approximation errors. The 
authors in pj instead equipped their VB algorithm with some thresholding techniques and proved that if 
the original signal is sparse enough, they are able to recover the location and the sign of the non-zero 
signal elements successfully. In what follows, we propose a similar thresholding technique to deal with 
the noisy measurements. 



Thresholding is a common technique in detection theory to deal with noisy measurements |38|. We apply 
this technique to VB algorithms by defining two thresholds e\ and e 2 . We use e\ to convert small noisy 
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measurements to zero; i.e., any measurement c, such that |c| < ei, is set to zero. We use e 2 as the 
acceptable tolerance for the equality of two noisy measurements; i.e., we consider two measurements c\ 
and c 2 equal if \c\ — c 2 | < e 2 . In this case, we assign ci and c 2 a new common value equal to (ci + c 2 )/2. 
While the scope of this work is not to optimize thresholds e\ and e 2 , our goal is to demonstrate the 
potential of thresholding in desensitizing the VB algorithms to the measurement noise. We explain this 
through an example and by comparing the performance of the SBB algorithm equipped with thresholding 
and two methods based on t\ minimization in the case where the measurements are noisy. 

Consider a signal of length n = 1000. We let the size of the support set, k, to increase from 10 to 150 
in steps of 10. For each such support size k, we pick k out of the n elements randomly, and assign to 
each element an even integer uniformly distributed in the range [—1000, 1000], independent of the value 
of the other nonzero elements. In the case of the SBB algorithm, the signal is measured through a (3, 6) 
unweighted bigraph. In the case of t\ -based algorithms, we use sensing matrices consisting of orthonormal 
columns with Standard Gaussian elements J37). In all cases, the number of measurements, m, is fixed 
at 500. Each measurement is independently contaminated with a Gaussian noise of mean and variance 
equal to a 2 = 0.25. 

For the SBB, we set both thresholds t\ and e 2 equal to 1.99. Since the value of nonzero signal elements 
are drawn from a finite alphabet and since the graph is unweighted, false alarm may occur with nonzero 
probability in the recovery process. In our simulations, we consider a recovery algorithm "successful" if 
it can fully recover the support set. 

The first l^-based recovery algorithm is the i 1 regularization method introduced in [[37). For the second 
algorithm, we empower the i\ regularization algorithm with the knowledge of the size of the support set. 
The algorithm thus keeps the k components that are the largest in magnitude and converts the rest to zero. 



To simulate the two £i-based decoders, we use the L1MAGIC package available in [39|. 



As the measure of performance, we consider the mean square error (MSE) between the original and the 
recovered signals. For each value of k, we perform simulations until we obtain 100 "successful" recovery 
instances. The results for the 3 algorithms are reported in Fig. [TJ where for each algorithm, MSE averaged 
over all simulated cases for a given value of k is shown. 

As can be seen, the SBB recovery algorithm significantly (by about two orders of magnitude) outperforms 
both Ei -based algorithms. 



VII. Simulation Results 



In this section, we present simulation results obtained by running the recovery algorithms over random 
biregular graphs to recover sparse signals of finite length n. We also present analytical results obtained 
through the mathematical analysis described in Section [V] for the asymptotic regime when n — > oo. This 



35 




Fig. 1. Comparison between SBB, £ 1 minimization, and modified l 1 minimization in terms of MS reconstruction error in the presence of 
Gaussian measurement noise with zero mean and variance a 2 = 0.25 (n = 1000, m=500). 

includes the success threshold of different VB algorithms over different biregular graphs. The comparison 
of asymptotic and finite-length results shows that there is a good agreement between the two for moderately 
large block lengths (n > 10 5 ). 

In all simulations, a signal element belongs to the support set with probability a^°\ unless otherwise spec- 
ified. Also, each nonzero signal element (variable) is drawn according to a standard Gaussian distribution. 
The biregular graphs are constructed randomly with no parallel edges and all the edge weights are equal 
to one. In each set of simulations, the sensing graph is fixed and each simulation point is generated by 
averaging over 1000 random instances of the input signal, unless specified otherwise. We repeated each 
simulation with different randomly generated graphs (with the same variable and check node degrees), 
and observed that the results were almost identical for every graph. Each simulation is run until the 
algorithm makes no further progress. In this case, if the signal is recovered perfectly, the recovery is 
called successful, otherwise a failure is declared. 

For the analytical results, based on the fact that or-*' is a non-increasing function of iteration number £, 
we consider the following stopping criteria: 

1) a W < 1(T 7 , 

2) «W > 1(T 7 and \a® - a («)| < lfr 8 . 

If the analysis stops based on the first stopping criterion, the algorithm is considered successful. If, on the 
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other hand, it stops based on the second criterion, the algorithm is considered unsuccessful and a failure 
is reported. To calculate the success threshold, a binary search is performed until the separation between 
the start and the end of the search region is less than 10 -5 . 

To motivate the use of recovery algorithms over sparse graphs, as the first simulation result, we present 
the comparison between the SBB algorithm and two benchmark ^i-based algorithms, i\ minimization [|2| 



and iterative weighted i\ minimization [40 1. The setup is as follows. For SBB, we choose a random (3, 6) 
biregular sensing graph with 1000 variable nodes and 500 check nodes. The sensing matrix used for the 
two £i-based algorithms consists of 500 rows and 1000 columns. The elements are initially i.i.d. standard 
Gaussian random variables. Then the rows are made orthonormal. 

The cost functions used in l\ and weighted t x minimization algorithms are \\v\\i := J2i \ v i\ an( ^ II Wv \\i : = 
^2iWi\vi\, respectively, where v is the original signal of interest with elements V{, and W is a diagonal 
matrix with positive diagonal elements Wi representing the weights. Weighted l\ minimization is an 
iterative algorithm in which the weights at iteration t (wf^) are updated according to = l/(\vf ~^|+e), 
where vf is the estimate of the signal element Vi at iteration t — 1. The weighted l\ is not very 



sensitive to the parameter e as noted in [40|. We found e = 0.1 is a good choice based on our simulations. 



Regarding the maximum number of iterations for the weighted £ 1 minimization algorithm, authors in [40 1 
show that as this parameter increases, better results are achieved, with the cost of longer running time. 
The improvement gained by increasing the number of iterations beyond 6 however, is negligible [ |40| . 
Therefore, in our simulations, we choose a conservative maximum number of iterations equal to 10. As 
l\ and weighted ii minimization algorithms output an estimate which is very close to the original signal, 
but not exactly the same, we declare a success for these two algorithms if the difference between every 
original signal element and its corresponding estimate is less than 10~ 2 . Lastly, we use the L1MAGIC 



package in [39| as the optimization engine for simulating l\ and weighted £ a minimization algorithms. 



To have a fair comparison, the same signal vectors are used for all the algorithms. We also choose the 
size of the support set deterministically, and let the size range from 10 to 300. For each support size, 
100 instances of the signal vector are generated. Each signal vector is then measured according to the 
corresponding sensing mechanism for each class of algorithms. The success or failure of the recovery 
algorithms over the resulting measurements are then averaged over the 100 instances, and plotted in Figure 
|2j In Figure [3] the average running time, in seconds, is plotted for the three algorithms. The algorithms 
were implemented in MATLAB and were run on a computer with an AMD Phenom 9650 Quad-Core 
Processor 2.3 GHz, 3 GB RAM and a Windows 7 operating system. As can be seen, the SBB algorithm 
recovers signals with more nonzero elements at a speed which is about 2 orders of magnitude faster 
compared to that of the t\ algorithms. 



For the next experiment, we apply XH, SBB and LM algorithms to four randomly constructed (5, 6) 
regular graphs with n = {3, 15, 100, 1000} x 10 3 . The success ratio of the algorithms vs. the initial 
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density factor a = are shown in Figure |4j From the figure, we can see that, for all algorithms, by 
increasing n, the transition part of the curves becomes sharper such that the curves for n = 10 6 practically 
look like a step function. In the figure, we have also shown the success threshold of the algorithms for 
(5, 6) graphs, obtained based on the proposed analysis, by arrows. As can be seen, the thresholds match 
very well with the waterfall region of the simulation curves. 




-SBB 

-WGightGd L1 



Fig. 2. Comparison between success ratios of 1%, weighted l\ and SBB algorithms for n = 1000, m — 500. 



In Table III we have listed the analytical success thresholds of the iterative recovery algorithms for graphs 
with different d v and d c values. The result for XH algorithm on (3, 4) graphs, and more generally for 
graphs with d v = 3, is missing as the algorithm performs poorly on such graphs^ For every graph, the 
Genie algorithm has the best performance. This is followed by SBB, LM and XH algorithms, respectively. 



Careful inspection of the results in Table III indicates that the oversampling ratio r a = d v /ad c , improves 



consistently by decreasing both d v and d c values. In fact, among the results presented in Table III the 
application of the Genie and SBB to (3,4) graphs results in the lowest oversampling ratio of ps 1.16 and 
ps 1.67, respectively. 



In Table IV we have listed the analytical success thresholds of the iterative recovery algorithms for 
graphs with compression ratio d v /d c = 0.5 and different d v and d c values. In general, as we decrease d v , 
algorithms perform better in terms of recovery capability^] This also implies that for a fixed compression 

l5 The reason is that for d v = 3, a variable node is verified with the common value of \d v /2] = 2 check nodes. However, if two nonzero 
variable nodes share the same two check nodes (a cycle of length 4 exists in the graph), then a false verification may occur. 

l6 These results are consistent with the results observed for the Belief Propagation (BP) decoding of binary LDPC codes based on biregular 
graphs. 
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Fig. 4. Success ratio of XH, LM and SBB algorithms vs. a = q (0) for (5, 6) graphs with n = {3, 15, 100 and 1000} x 10 3 . Analytical 
thresholds are shown by arrows. 



ratio, the oversampling ratio improves by decreasing d v and d c . 



In Tables [V] and VI , we have listed the number of iterations required for different recovery algorithms to 
recover signals with density factor equal to the success thresholds reported in Tables [HI] and [TV] minus 
0.0001, respectively. These results, which are obtained by the asymptotic analysis are in close agreement 
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with finite-length simulation results at block lengths of about 10 5 . These results indicate that with a few 
exceptions the better performance comes at the expense of a larger number of iterations. In particular, 
among the practical recovery algorithms, SBB requires the largest number of iterations for convergence. 

TABLE III 

Success Thresholds for different graphs and algorithms 



(d v , d c ) 


(3,4) 


(5,6) 


(5,7) 


(5,8) 


(7,8) 


Genie 


0.6474 


0.5509 


0.4786 


0.4224 


0.4708 


SBB 


0.4488 


0.3892 


0.3266 


0.2806 


0.3335 


LM 


0.3440 


0.2871 


0.2305 


0.1907 


0.2385 


XH 




0.1846 


0.1552 


0.1339 


0.1435 



TABLE IV 

Success Thresholds for different graphs and algorithms for fixed compression ratio r c = 0.5 



Graphs: (d v , d c ) 


(3,6) 


(4,8) 


(5,10) 


(6,12) 


(7, 14) 


(8, 16) 


Genie 


0.4294 


0.3834 


0.3415 


0.3074 


0.2797 


0.2568 


SBB 


0.2574 


0.2394 


0.2179 


0.1992 


0.1835 


0.1703 


LM 


0.1702 


0.1555 


0.1391 


0.1253 


0.1140 


0.1048 


XH 




0.1875 


0.1050 


0.1170 


0.0791 


0.0834 



TABLE V 

Number of iterations required for different recovery algorithms over different graphs to recover a signal with 

density ratio equal to the success threshold minus 0.0001 



Graphs: (d v ,d c ) 


(3,4) 


(5,6) 


(5,7) 


(5,8) 


(7,8) 


Genie 


106 


66 


66 


62 


55 


SBB 


655 


178 


165 


200 


344 


LM 


258 


139 


103 


126 


108 


XH 




63 


58 


54 


41 



TABLE VI 

Number of iterations required for different recovery algorithms over different graphs with fixed compression 
ratio r c = 0.5, to recover a signal with density ratio equal to the success threshold minus 0.0001 



Graphs: (d v , d c ) 


(3,6) 


(4,8) 


(5, 10) 


(6,12) 


(7, 14) 


(8, 16) 


Genie 


93 


69 


57 


50 


46 


41 


SBB 


247 


167 


172 


163 


127 


108 


LM 


142 


94 


136 


97 


55 


67 


XH 




64 


48 


38 


32 


28 



To further investigate the degree of agreement between our theoretical asymptotic analysis and finite- 
length simulation results, we have presented in Fig. [5] the evolution of with iterations £ for the four 
algorithms Genie, LM, SBB, and XH over a (5, 6) graph. For each algorithm, two values of are 



selected: one above and one below the success threshold presented in Table III The theoretical results 
are shown by solid lines while simulations for n = 10 5 are presented with dotted lines. As one can see, 
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the two sets of results are in close agreement particularly for the cases where is above the threshold 
and for smaller values of L 




15 20 25 

Iteration {t) 




10 15 20 25 

Iteration (£) 




-Theoretical Above Threshold 
-Theoretical Below Threshold 
-Simulation Below Threshold 
-Simulation Above Threshold 




Fig. 5. Evolution of ot^ vs. iteration number I for the four recovery algorithms over a (5,6) graph (finite-length simulations are for 
n = 10 5 ). 



Next, for different values of a^°\ we estimate the average fraction of unverified nonzero variable nodes 
using the analysis, and denote the value of at the time that the analysis stops (because one 
of the stopping criteria is met) as a^ stop \ These values are plotted vs. the corresponding values of cr ' 
in Fig. [6] for the four VB recovery algorithms over the (5,6) sensing graphs. In the same figure, we 
have also given the corresponding simulation results for two randomly selected (5, 6) sensing graphs with 
n = 10 5 and 10 6 . The simulation results for both lengths closely match the analytical results, with those 
of n = 10 6 being practically identical to the analytical results. We have indicated the success threshold 
of the algorithms by arrows. From the figure, it can also be seen that as increases and tends to one, 
the curves tend to the asymptote a 



(stop) 



a 



(0) 



As the last experiment, we compare the running time and the accuracy of the proposed asymptotic analysis 
against those of the differential equation approach presented in p3| . For comparison, a biregular (3,6) 
graph and the SBB algorithm were chosen. The binary search for the success threshold starts with the 
interval [0.2, 0.3] and ends when the separation between the start and the end of the search region in 
less than 10~ 5 . The analysis is implemented in MATLAB and executed on the same computer described 



before. Using the proposed analysis, we obtain the success threshold of 0.2574 in 23.1 seconds. Table VII 



summarizes the results of running the analysis of [23| on the same machine for different values of n. The 
reported thresholds increase with the increase in n. For n = 100, 000, the running time is roughly 100 
times that of our proposed method. Moreover, the obtained threshold of 0.2591 is only in agreement with 
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Genie 




0°25 



Fig. 6. Fraction of unrecoverable variable nodes for different recovery algorithms and different starting density factors over random (5,6) 
regular bipartite graphs. Each arrow represents the theoretical success threshold. 



the threshold of 0.2574, obtained by the proposed method, up to two decimal points. In fact, experiments 
similar to those reported in Fig. [5] reveal that the accuracy of the threshold obtained by the method of p3| 
is lower than our results. In particular, our simulations show that the SBB algorithm over (3, 6) graphs 
with n = 10 5 fails for = 0.259, which would imply that the threshold 0.2591 is only accurate up to 
two decimal points. 



TABLE VII 



Success threshold and running time of the analysis of [23] for SBB over a random (3, 6) regular graph 



n 


100 


1,000 


10,000 


20,000 


50,000 


100,000 


Success Threshold 


0.2465 


0.2577 


0.2589 


0.2590 


0.2590 


0.2591 


Running Time (seconds) 


1.1 


9.9 


103.9 


220.6 


647.4 


2044.1 
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Appendix I 

Detailed Description of the Analysis for Genie 

A. General Setup 

To derive the update rules in the analysis of the Genie, assume that we are at the start of the first half- 
round of iteration t. We thus have the probabilities Pj^ 1 ' 1 ^, and Px~ l,2 \ and are interested in deriving 
the same probabilities for iteration i + 1. The update rules relate the two sets of probabilities; one at the 
beginning of iteration t and the other at the end of iteration i. Hence, we first derive the update rules for 
iteration t, then we discuss the initial conditions for iteration 0. 

In the following, we use the notation s e to refer to the status bit in the message transmitted over edge e 
from variable node to check node. 

B. Derivation of Formulas 

When a variable node is recovered in the second half-round of iteration i — 1, the d v edges adjacent to 
the variable node carry the recovery message to the neighboring nodes. Therefore, check nodes neighbor 
to the recovered variable node face a reduction in their degree. We denote by P)j jXi the probability that 
the degree of a check node in the induced subgraph is reduced from % to j < % after the first half-round 
of iteration £. This happens if out of % edges emanating from the check node and incident to the set of 
unresolved variable nodes KM\ i — j of them carry a message from their variable nodes indicating that 
they have been recovered. 

On the other side of the graph, when a variable node in fcf~ 1,2 ^ (1 < i < d v ) is recovered, by definition, out 
of d v check nodes receiving the recovery message, i have degree 1 and d v — i have degree j (2 < j < d c ). 
In the asymptotic case, as n grows large, we may assume that for each recovered variable node, the set 
of i check nodes of degree 1 and the set of d v — i check nodes of degree more than 1 are distributed 
independently and uniformly with respect to the set of all check nodes of degree 1 and the set of all 
check nodes of degree more than 1, respectively. As one set contains check nodes of only degree 1 and 
the other contains check nodes with a variety of degrees, we shall differentiate between P$ jU for j = 1 
and j > 1. Once these probabilities are found, the new distribution of check node degrees P^j^ can be 
derived using the total probability law: 

j=i 

To find the probability P$ ., j > 2, we need the conditional probability that an edge connecting a check 
node in the set J\ff~ 1,V> and an unverified variable node, carries a recovered message to the check node in 
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the first half-round of iteration t. We denote this conditional probability by P d > v Assuming this probability 
is known, we have: 

i-l-JWl 1 "*)' i = 2,-,4, i = 0,-..,i. 
The probability can be computed as follows. 

p£i = = l|c e {^- 1 ' 1 >, • • • ,MV'%v g 

= 5>[„ G /C?- 1 ^^ G • • • ,^-^},r; G 



i=l 
d v 



Pr[c 6 {Mt">, ■ ■ ■ Mt 1A %' e Kt m ,v 6 K*'" 1 )] Pr[« 6 itf" 1 ' 2 ^ £ K<»- 



S Pr[c 6 {7V<'-«>,...,<-''"}|«eX;«-»] 



i=l 



E 



(l-Pr[ceAf, ( '" 1 ' 1 VeK( < - 1 >]) 



1 - pW 1 - pW 



1=1 

where, p^ is defined as the probability of the edge e being adjacent to the check node c G 
conditioned on the fact that the variable node v G /C^ _1 \ This probability can be calculated as: 

pW = Pr[cG^- 1 ' 1) |t;G/C^- 1 )], 

= f>[c G ^" 1 ' 1 >,t; G /C^t; G /C^ 1 )], 

i=0 

= £Pr[c G G /C?- 1,2 \« G /C^ 1 )] Pr[t; G /C^V G /C^], 

i=0 

eft, 

i=0 " 

Therefore, the probability can be simplified as follows. 

E p (/-1,2) _ V- J_ „(€-l,2) 
£« / / J K-i 
Al) i=i j=i 

d >! 1 - pW 



l_p^)_ pW 
(5-1,2) 
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By using Bayes' rule, the probability can be calculated more efficiently as: 



P 



{£) 



Prtce^- 1 ' 1 '!^^- 1 '], 

Pr[v G K^\c G Pr[c G A/? _1,1) ] 



Pr[v G /C^l 



(£-1,1) 



(20) 



In the Genie algorithm, all variable nodes neighbor to at least one check node of degree 1 are recovered. 
Therefore, all check nodes of degree 1 must be grouped as degree in the first half-round of the next 
iteration. Hence, we have: 

p$ = 1, p ( ff = 0. 
■/V14.0 ' A/14,1 

So far, we found the update rules for the first half-round of iteration i. In the second half-round, variable 
nodes receive messages from their neighboring check nodes. The degrees reflected in the messages may 
re-group some variable nodes. According to the verification rule in the Genie algorithm, the only unverified 



set of variable nodes is the set K, 



(£-1,2) 




Variable nodes in this set have no connection to check nodes 



in the set M) but d v connections to the sets {Af^ 1A \ ■ ■ ■ ,Hf Suppose v G JCt h2) . In this 

case, if one of the adjacent check nodes of v in {A/^ -1 ' 1 ^, • ■ • jA/^ -1 ' 1 ^} turns to a check node in 



v will move from /C 



(£-1,2) 



to Kf+i . This is shown in Figure 







/ M \ 




1 d„ - i - 1 


' 1 

< 




1 
1 


1 




1 

I ! Km / 



Fig. 7. A variable node in Id turns to a variable node in /C;+i. 



Due to the random structure of the graph assumed in the asymptotic case, the single edges connected to 
the set of check nodes N[ are uniformly and independently distributed with respect to the candidate 
edges adjacent to the set of check nodes {A/^ , ■ ■ ■ , A/"^ -1,1 ^}. 

In the Genie, the probability is defined as the probability of a variable node v G /C[f -1 ' 2 ^ turning to 
v G K!f ,2 \ As the set /C^ -1 ' 2 ^ 



P 



&2) 



P 



(0 
/C 



is also the unverified support set, we have: 



Pr|V G /Cf°|t; G Kt XA >* G =(7) (1 -P?) 



(£)\ d «-i 



J = 0, 



j 4j 



(21) 
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where pjp is defined as the probability that an edge adjacent to a variable node v G Kq : ' 2 ^ carries a 
message indicating that the adjacent check node c has a degree equal to 1; i.e., c G A/"/ . The probability 



Px- 1 is calculated as follows: 



i=0 

1=0 c i=l 

In the Genie algorithm, a variable node is resolved if among all its neighboring check nodes, there exists at 
least one with degree equal to 1. Hence, the probability of a variable node in the set KM' being recovered 
is calculated as X^Ii^jcf^- Therefore, according to the total probability theorem, the probability of a 
variable node v remaining unverified, i.e., v G K^ +l \ is: 



i=i 



C. Initial Probabilities for the Genie Algorithm 

In the first half-round of iteration 0, all check nodes have degree d c in the subgraph induced by the 
unverified variable nodes. This degree is reflected in their outgoing messages. These messages carry no 
information to the variable nodes, and no variable node can be resolved based on them. On the other 
hand, variable nodes not in the support set are resolved with a value equal to 0. Assuming an initial 
density factor of a^°\ 1 — cr°) fraction of the messages from variable nodes carry a recovery notice to 
check nodes. Since at iteration no variable node in the support set was verified we have = a^°\ 
Hence, = 1 — Since pjy ^ = 1, the set of probabilities pfy % is given by the following. 



= (0,1) CO CO <i<d c 

d - \ f r W V c ^ ft JO 
d c - i 
d c 
i 



>1 \ x ^d>l 



< i < d c 
(a«)'(l-a«)*'" i , 0<i<d c . 



To find the probability P^f\ we first find the probability P^ from (22) and then replace it in the (21 ). 
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Appendix II 

Detailed Description of the Analysis for LM 

We first prove Theorem [3] in which the verified variable nodes in the support set are characterized. 

Proof of Theorem ^ Suppose that we partition the set of unknown variable nodes in the support set 
K.^ into subsets K,f , where % represents the number of messages received by the variable node indicating 
that the transmitting check node has degree 1; i.e., i neighboring check nodes are in the set A^q. Since 
check nodes in A/i o have both degree 1 and a non-zero value, each variable node v in the set Uti 
has at least one non-zero check node of degree one as a neighbor. Recall that in each iteration of the LM 
algorithm a non-zero variable node is verified if and only if it is connected to a check node of degree 
one. Therefore, v £ U^=i m e ans that v is recovered at iteration £ of the LM algorithm. 

To prove the converse, we assume that a variable node v is resolved at iteration £ of the LM algorithm. 
We show that v £ K-®- The fact that v is resolved at iteration £ of the LM algorithm implies that it 
was connected to at least one check node of degree one. As we dealt with all zero-valued check nodes 
in the previous round, the degree-one check nodes must have a non-zero value. Therefore, the variable 
node v is, by definition, in the set IJ*^ K,f\ ■ 

A. Recovering Variable Nodes 

• A variable node in /C is recovered if it is connected to a check node in A/^o- 

• A variable node in A is recovered if it is connected to a check node in U^Li-M),^ i- e -> a zero-valued 
check node. 

B. Iteration Zero 

The messages passed in the first round of iteration zero do not contribute to the verification of variable 
nodes and do not change the grouping of the check nodes. Iteration zero thus consists of only one round 
and hence, two half-rounds. In the first half-round, check nodes pass their values along with their degrees 
(d c ) to their neighboring variable nodes. In the second half-round, variable nodes process the incoming 
messages. A variable node is verified if it receives at least one message with a value equal to zero. In 
this case, the variable node is verified with a value equal to zero according to the ZCN rule. The set of 
all variable nodes verified in this half-round make the set TZ^ l \ 

Let denote the set of check nodes with i neighboring variable nodes in the support set. The probability 
P$ defined as the probability that a check node belongs to the set A/$ is calculated as follows: 

P$ 4 Pr(c £ A/f) = (a<°>) 4 (1 - a<°>)^ , i = 0, ■ • • , d c . 
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Hence, the probability that a check node c has a value equal to zero (c = 0) is: 

Pr[c = 0] = P$ = (l -« (0) ) dC - 
Let Aj 0,fi2 ' 2 ^ denote the set of zero-valued variable nodes that receive j zero-valued messages. The 

(0 -R2 2) (0 2") 

probability P^ A [ ' ' defined as the probability that a zero-valued variable node belongs to the set ' ' ' 
is calculated as follows: 

pjowo = Pr[v e A (ow)| w ^ ^(o)] = Q ppy ^ _ p f)^- j ; ^ = o, • • • ,4, 

where is the probability that an edge adjacent to a zero-valued variable node carries a message with 
value zero, and is calculated as follows: 

pf = (l-a®) de ~ 1 . (23) 

Let P A denote the probability that a variable node has a zero value and does not receive even one message 
with value equal to zero in the second half-round of iteration zero. We have: 

p£ } 4 Pr[u G A«] = Pr[t; £ /C<°>] Pr[ W e A^V £ AC^] = (1 - a (0) ) (l - (l - ar<°>) < *°~ 1 )*' . (24) 

Since no element of the support set is verified at iteration zero, we have K,^ = KS l \ and hence, 

a« = Yi[v e £«] = 

At the end of iteration 0, all check nodes have degree d c in the subgraph induced by the unverified 
variable nodes. Thus, the set of all check nodes can be partitioned into subsets Mf^^f^, where i denotes 
the number of neighboring variable nodes in the support set (0 < i < d c ). The edges adjacent to a check 
node are partitioned into two sets: /C-edges and A-edges. /C-edges are connected to variable nodes in the 
support set, while A-edges are connected to zero- valued variable nodes. Therefore, a check node in the 
set J\f i J 1 ^^ (0 < i < d c ) has i, /C-edges and d c — i, A-edges. 

C. Iteration One and Beyond 

Here we present the analysis for iteration one. Since the analysis of the second iteration and beyond is 
similar to that of iteration one, they are omitted. The summary of the formulas can be found in Section 

The verified messages sent from variable nodes to check nodes at the end of iteration zero, are processed at 
check nodes at iteration 1, HR1-R1. Based on the recovery process at iteration zero, all verified messages 
are sent from variable nodes in the sets Aj, 1 < j < d v . We partition the set of edges adjacent to a 
variable node in the set Aj, < j < d v , into A/"=o-edges and A/^o-edges. Edges in the set A/"=o-edges are 
connected to zero-valued check nodes (check nodes in the set A/q), while edges in the set A/^o-edges are 
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connected to non-zero check nodes. 

Since check nodes in the set A/"o ' H2 receive d c verified messages, we are interested in the set of check 
nodes that are regrouped from M^^ 2 ^ to H^ Rl,r> in HR1-R1 of iteration 1 for 1 < i < d c . Such a 
set of check nodes is denoted by A/^^^-. The messages responsible for such regrouping are carried 
over A/^o-edges. To analyze the regrouping, we need to find the probability Vz R of an edge in the set 
of A-edges to carry a verified message. Such edges are not connected to the set A . Before finding the 
probability P £r , we introduce two notations v e and c e to denote the variable node and the check node 
connected by means of the edge e. Now, we have: 



£r 

= 1 



Pr[c e i M^\v e i JCW] 
Pr[v e € A^ R2 ' 2) \v e j K^\Vi\c e jM^\v e j K^\v e £ A 



1 - 



j 

(0,R2,2)-i 
\_ 



p (0,R2,2) x ° x 



p(o, ( d — i 



X 



d, 



(0,R2,2) (l) 
1 _ p(°) 1 _ p(°) ' 



where and are given in §2A\ and ( |2~3| ), respectively. We thus have the following regrouping of 



check nodes based on the second index: 

i,.Ri) = (dc ~ A ( 1 _ p (i,Ri)) 

A,dc-iij \ j ) \ £ « / V t R 



Hence, 



^ ,1,=P S?>Si^' = A, J = 0,---,4-,. (25) 



In HR2-R1, messages are sent from check nodes to variable nodes. At this stage, check nodes in the 
set J\fiQ R transmit a message with its first coordinate equal to 1. Variable nodes in the support set 
that receive at least one such message, are verified with the value contained in that message in HR2-R1. 
After processing the received messages, we divide the set of all variable nodes in the support set KS l > 
into subsets K^ R , < i < d v , where i denotes the number of degree-1 messages a variable node 
receives. We denote the set of such variable nodes by K^1 R1 \ Let p( l i?1 ) denote the probability that an 
edge adjacent to a variable node in the support set carries a message with the first coordinate equal to 1. 
Using the same notations v e and c e defined above, we have: 

P ^ = ^[c e eM^ l) \vee^% (26) 
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Pr[c e e Af$ R1A) ] N^f g e Affl^j ,„ 
Pr[ Ue e CT] ' 1 j 



(!) aWdr ' 



(28) 



Hence, the probability P^'^ that a variable node v G /C^ belongs to the set JC^' R1 ' 2 ^ is calculated as 
follows: 

pg* 1 ' 2 ) = p^ 1 ) 4 Pr(t; G /cf' m ' 2 V G /C«), (29) 
= ^)(p(W) i (l-p(^))*'- i , < = 0,...,d„. (30) 

Based on the D1CN rule in the LM algorithm, variable nodes in the set \J d j l l K,f ,Rl ^ are verified. 
Therefore, the probability that a variable node in the support set remains unverified for iteration 2 is as 
follows: 

« (2) = « (1) (i-XX;* 1, 

V i=i 

With this, the first round of iteration 1 is over. In HR1-R2 of iteration 1, check nodes receive messages 
from variable nodes. At this point, check nodes should be regrouped based on their first index, as some 
variable nodes in the support set have been verified at HR2-R1. A check node in the set Af^ R1 '^ is 
regrouped into the set A/^'^ 2 ' 1 ^ at HR1-R2, if from i edges in the set of /C-edges adjacent to the check 
node, % — j of them carry a verified message. The set of such check nodes are denoted by A^* . 
To analyze this regrouping, we need the probability p^' R2 ^ that a /C-edge carries a verified message at 
HR1-R2. To find this probability, we proceed as follows: 



P^ R2) = 1 - PrK G G K« c e i <o m,1) ], 

Prfr e G G JCW] Pr[c e j M[]^ l) \v e G jgU G tf^j 

Pr[c e £ A-ft^Ve G /CW] 
PrK G ^^^Ve G Pr[c e £ A/ft^V e G ^' m ' 2 )] 



1 



1 - 



1 - 



G ^ (1 ' m ' 2 Ve G Pr[Ce £ Aft^Ve G G '^J 

pjWa.2) x x 



d v 



(l,fll,2) / 



1 - 



, D (l,fll,2) 



(i,m,2) 



i=0 



1 - 



p (i,m, 2) 

/C.Q 

1 _ ' 
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where P^f 1,2 ^ and pt 1 ^ are given by ([30]) and ( [28] ), respectively. Hence, the probability P^f^fl mat a 
check node belongs to the set A/^'f 2 ^ is calculated as follows: 

c;=Q(* R2, ) i "( i - p "' R2, y- *=».••■.<. *=o,-,*-«. 

Note that by construction, we have = 1 and Pfyffl = 0. After the regrouping, the probability 

pfy^' 1 ^ that a check node belongs to the set Afj is calculated by: 

11 = e^'ms. i=o,-,*, *-<>,•••,*-<. 

t=j 

The measurement corresponding to check nodes in the set , 1 < fe < d c — 1, changes to zero, 

as the check nodes are no longer connected to an unverified variable node in the support set. Hence, the 
messages transmitted by such check nodes have a value equal to zero, which in turn verifies some variable 
nodes (in the set A) at HR2-R2 of iteration 1. This is indeed the last step in iteration 1. 

As explained before, the probability p^' R2 ^ is needed to find the probability that a zero-valued variable 
node is verified at this stage. This probability is calculated as follows: 

r ) = E Pr [^C 2,1) K 6A(1) ]> 

3=1 

d ^ Pr [c e e <^ 2 ' 1} ] Pr[v e e A« | Ce G Af^^ 



i=0 j=l 



3 (1,-R2,1) / j_ 

^ U 

^L—/ d c d c — 1 



-1 



j=l ^ ^ ^(1,^2,1) / J_ 



i=0 j=l 



d c -l 7,(1.-^2,1) 
J ci c dc-l 



i=0 j=l 



Note that the denominator is indeed Pr[v e G A^] = P^ . Hence, the probability p^^ 2 ' 2 ^ as defined 
before, is calculated as follows: 

p a,*2,2) = ^ ^ _ p a.«o)*'- i i i = o,... , dv . 

Lastly, the probability that a variable node is zero-valued and remains unverified for iteration 2 is as 
follows: 

D (2) _ D (1) D (1,B2,2) 
^A ^A y A 
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Appendix III 
Detailed Description of the Analysis for SBB 

Proof of Theorem 4\ First we show that if a variable node v G (jfca fcf ,m,2) U tf' R1,2) , it will be 
recovered in the SBB algorithm at iteration £, HR2-R1. A check node in the set M\ is only connected to 
one element of the support set. So, a variable node v in the support set with i neighbors in the set Mi 
receives i messages from the neighboring check nodes with the same value. Hence, variable nodes in the 
set U&2 K,f ,R1 ' 2 ^ are recovered. Variable nodes in the set K,f' R1,2 \ however, can not be verified according 
to ECN rule because they are not connected to at least two check nodes with the same value. So, the 
only verification rule applicable would be the D1CN. With the definition of the set K\ ' ' , a variable 
node v G K\' is verified based on D1CN at iteration £ of the SBB algorithm. 

To prove the converse, we have to show that when a variable node v is verified with a non-zero value at 
iteration i of the SBB algorithm, then we have v G U^2 K.f' R1 ' 2 ^ U ICf' R1 ' 2 \ The statement is true based 
on the definition of the sets and the fact that false verification happens with probability zero. ■ 



A. Verifying Variable Nodes 

• A variable node in K\ is verified if it is connected to a check node in Mi$. 

• A variable node in /Q, 2 < i < d v , is verified if it is connected to a check node in Mi = IJ^q 1 jVij. 

• A variable node in A is verified if it is connected to a check node in Uti-M),^ i- e -> a zero-valued 
check node. 



B. Iteration Zero 



The process for iteration zero is the same as that of the LM algorithm discussed in II-B The analysis is 
thus not repeated here. 



C. Iteration One 

In the first half-round of the first round (HR1-R1) of any iteration, check nodes process the received 
messages from variable nodes and generate the outgoing messages accordingly. Note that HR1-R1 in the 
SBB and the LM algorithm are the same. 

In this section, we adopt the notation Mi, with any superscript, to denote the set {Jj^Mij. The verified 
messages sent from variable nodes to check nodes at the end of iteration zero, are processed at check 
nodes at iteration 1, HR1-R1. Based on the recovery process at iteration zero, all verified messages are 
sent from variable nodes in the sets Aj, 1 < j < d v . We partition the set of edges adjacent to a variable 
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node in the set Aj, < j < d v , into A/"=o-edges and A/^o-edges. ACo-edges are connected to zero-valued 
check nodes (check nodes in the set A/o), while A/^o-edges are connected to non-zero check nodes. 

We are interested in the set of check nodes that are regrouped from A/^^ to A/^'^ 1 ' 1 ^ for 1 < i < d c , 
and eventually the probability pff R1,1 ^ in HRl-Rl of iteration 1. The calculation of this probability is 



exactly the same as the derivation of (25) in the analysis of the LM algorithm 



In HR2-R1, the following 2 types of variable nodes in the support set are verified: 

1) variable nodes neighbor to at least one check node of degree 1; i.e., variable nodes that receive at 
least one message with the first coordinate equal to 1. These variable nodes are verified with the 
value contained in that message based on D1CN. 

2) variable nodes neighbor to at least two check nodes in the set A/^ ' ; i.e., variable nodes that 
receive at least two messages with the same value. These variable nodes are verified with the 
common value of the messages based on the ECN rule. 

Therefore, after processing the received messages, we divide the set of all unverified variable nodes in the 
support set /C^ into subsets K.!- 1 , < % < d v , where i denotes the number of neighboring check nodes 
in the set A/^ 1 ' . We denote the set of such variable nodes by K.fy R1 \ It is worth mentioning that some 
variable nodes in the set K.f~' R1 ' are verified according to D1CN and ECN. The remaining unverified 
variable nodes, after the recovery, make the sets K.j R1 ' 2 \ < j < d v . The sets }C\ 1,m ^ are removed 
from the summarized formulas in Section V-E to prevent any confusion. They appear here because they 
simplify the notations and explanations. 

Let p( 1>m ) denote the conditional probability that an edge is adjacent to a check node in the set J\f^' Rl,x ^ 
given that it is adjacent to a variable node in the support set. Using the same notations v e and c e defined 
before, we have: 



= p r [ Ce e A/f'^Ve e 



Pr[c e G A/i (1,m,1) ] Pr[v e G £W|c e G A/f'^] 



Pr[v e G /C«] 

Hence, the probability P^'^ that a variable node v G /C^ belongs to the set K,f ,Rl " > is calculated as 
follows: 



d. 



n (v^ ri) ) 1 (i - p^ m) ) dv '\ i=o,---,d v . 
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Based on the ECN rule, variable nodes in the set \jfl 2 ^t'^ ar e verified. A fraction f^ R1 ^> of variable 
nodes in the set K^' R1 ^ that receive a message with the first coordinate equal to one are also verified 
based on the D1CN rule. Using the Bayes' rule, this fraction is calculated as follows: 

= Pr[c e G A/fo^V G JC?> m \c e G <« JB ' 2 >]. 
By omitting the superscripts, we obtain: 



Pr[c e eM,o] Pr[u e e/Ci| 


c e G A/i,o] P^[c £ 


:GMl 


c e G Afi, ,v e G /Ci] 


Pr[c e G A/i] Pr[w e G /C^ 


c e G A/i] 



pgff' 1 * x X x 1 
"m x a 

„(i,m,i)' 

where we have used the fact that Pr[t> e G /Ci|c e G A/i,o] = Pr[i> e G K\\c e G A/i] = A. Therefore, the 
probability that a variable node in the support set remains unverified for iteration 2 is as follows: 

a (2) = a (i) ^ _ _ j2p^ r1) 



The final regrouping of variable nodes into sets fC\ ' , < « < d„, is performed by taking into account 
the verification of some sets of variable nodes. We have: 

„(i,m,2) _ 1 „(i,fli) 

D (i,fli,2) _ 1 f (i,m)\ D (i,fli) 

- /\rfi,i?n I 1 ■/ ) y K. x ■ 



p (i,m,2) = ^ 2<i<d v . (32) 



Ar(i,«i) 

The normalization factor j\K 1,m ) is used to make the set of parameters Pjq^ 1 ' 2 ^ a valid probability measure, 
and is calculated as follows: 

,(2) 



(I)' 

With this, the analysis of the first round of iteration 1 is over. In HR1-R2 of iteration 1, check nodes 
receive messages from variable nodes. At this point, check nodes should be regrouped based on their 
first index, as some variable nodes in the support set have been verified at HR2-R1. Since a fraction of 

fl R\ 2) 

variable nodes in the set K\ ' ' ; are left unverified, unlike the LM algorithm, not all check nodes in 
the set J\f[ 1 J R1,1 ' > (1 < j < d c — 1) are regrouped into the set A r j'' R2 ' 1 ^; some will stay in the same set 
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. Hence, in addition to analyzing the set of check nodes M^.^ that are regrouped from Af^' m '^ 



to j\f^- R , we also have to analyze the set of check nodes that are regrouped from N{~J Rl '^ to 

r(l,j '' 
O.J 



j, • , wc aisu nave iu aiiaiyz,c uit; sci ui v^iic^js. iiuucs jv^q ■ uicu cue icgiuupcu iiuiii J\ X j 



Suppose that edges adjacent to a check node are partitioned into two sets: /C-edges and A-edges. /C-edges 
are connected to variable nodes in the set KS X \ while A-edges are connected to zero-valued variable 
nodes. To analyze the regrouping of check nodes in the sets j\f^ ,R1,l > and J\[!> 1,R , 2 < % < d c , we need 
the probabilities ?^=i and P^ R2 ^ defined as follows. The probability P^J^ 2 ^ is defined as the conditional 
probability of an edge carrying a verified message given that it is a /C-edge adjacent to a check node 
in the set jV x R1,1 \ The probability Pjj^ 2 ^ is defined similarly with respect to the set of check nodes 
in Ui*=2 • To find the probability P^=i ' we shall consider only the variable nodes in the set 

Ui^2^i R1 ■ This is because, variable nodes in the set K.^' 1 * 1 ' 2 ^ are connected to check nodes in the 
set A/iq^ 1 only. Let f e denote the status flag (f e G {0, 1}) of the message carried over the edge e. We 
proceed as follows: 



= Pr[v e g [J Kf^\v e e K« c e e M^ R1 \ 

= 1 - Pr[J e G ^'^Ve G K« c e G A/f^], 

= x _ Pr[t; e € /Cj^V e G /C«] Pr[c e G A/f'^V 6 /Cj 1 '^, ^ e G KP 



Pr[c e eAfi 1,R1,1) KelCM} 



p£l R1 ' 2 h/d v 

p (l,Rl) ■ 

(i,m,2) 



d v P^ R1 ) ' 



where p^f 1 ^ and P^* 1 ) are given by ( |32| ) and ( [31] ), respectively. Using the same approach, the calcu- 
lation of the probability P d l x follows. In each step, we omit some of the superscripts to simplify the 
presentation. 



d c 



3=1 

d c d c 

= 1-PrK G K,$' R1 ' 2) \v e G /C«,c e G U-^] -Pr[«e e /C?'^, / e = 0|v B G /C«,c e G |J^-], 



3=1 3=1 

d c 

1 — Pr[f e G /C |f e G /C, c e G |J A/} 



3=1 

d c 



Pr[t» e G K x \v e G /C,c e G (jA^]Pr[/ e = 0|w e G £i,u e G /C,c e G |JA^], 

i=2 j=2 
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, Pr\ve e IC \v e e JC] Pr[c e e U^^K e IC ,v e e K] 

Pr[c e £U-= 2 A/> e £/C] 

_ Prfr e £ /C^e £ K] Pr[c e e Lfc 2 JV> e e K x ,v e e K] 

Pr[c e £U-= 2 A/> e £/C] 

. _ PrK € /Co|^ e £ /q Pr[c e £ U^A^K g go^ £ g] 
" ' 1 - Pr[c e £ A/> e £ /C] 

Pr[t; e £ /Ci|u e £ K] Pr[c e £ Ut 2 ^>e £ K u v e £ /C] 



(1 " f^) , 



i-Pr[ce eMK e £] 



(l - / {1 ' m) ) , 



,(i,ra,2) 



(i,m,2) P Kl . 

l_p(l,fil) ' ' 



cL - 1 



Hence, the probabilities Pfy^ and P//^ that a check node belongs respectively to the set of check nodes 
■^ilCP anc * ^ilCP dXQ calculated as follows: 



pSS = 1 ' j = i,-..,^-<- 



After the regrouping, the probability Pfyf^' 1 ^ that a check node belongs to the set N^- R2,V> is calculated 
as follows: 

d c 

„(1,K2,1) _ „(l,iil,l)„(l,R2) , _ p. , . _ p. , . 



The measurement corresponding to check nodes in the set A/q"^'" R2 ' 1 \ 1 < A; < g? c — 1, changes to zero, 
as the check nodes are no longer connected to an unverified variable node in the support set. Hence, the 
messages transmitted by such check nodes have a value equal to zero, which in turn verifies some variable 
nodes (in the set A) at HR2-R2 of iteration 1. This is indeed the last step in iteration I. 

As defined before, the probability p^' m \ is needed to find the probability that a zero-valued variable 
node is verified at this stage. This probability is calculated as follows: 

_ g Pr[c e £ </ 2 ' 1} ] PrK £ A (1) |c e £ J^f^] 

i=0 j=l 
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„(l,fl2,l) / 3 
" ! ^ { d, 



E 



d„ <Z n -l 

J U 



i=o i=i 

^ d c d c -l 
i=0 j=l 



Note that the denominator is in fact Pr[v e e A^] = P^. Hence, the probability P^!' R2 ' 2 \ defined as 
the probability that an unverified zero-valued variable node belongs to the set A- ' ' ', is calculated as 
follows: 



Lastly, the probability that a variable node is zero-valued and remains unverified for iteration 2 is given 
by: 

D (2) _ D (1) D (1,B2,2) 
r A r A r A 



Z). Iterations Two and Beyond 

At iteration 2, HR1-R1, the regrouping of check nodes based on their second index is similar to the 
process in HR1-R1 at iteration I. We have: 

D (2) 

p (2,Rl) = 1 _ 
£ fl ]_ _ p(l,R2) • 

For the regrouping of check nodes based on the second index we thus have: 



Hence, 



d c -i 

(2,m,l) _ p (l,«2,l) p (2,fli) ,■ - 1 . . . w A- - n • ■ ■ d — 7 

j=fc 



At iteration 2, HR2-R1, variable nodes in the support set IC^ should be regrouped. Note that the set 
consists of two partitions: K.q~' R1 ' 2 ^ and )d[' R1 ' 2 \ Since variable nodes in the set K,^' R1 ' 2 ^ are connected 
to check nodes in the set J\f^ 2 ' R1,1 \ and since the regrouping of the variable nodes is based on the number 
of their neighbors in the set J\f^' Rl,1 \ we shall partition the set of check nodes in J\f[ 2,R1,V> into two sets: 
jy(2,m,+) an( j jy(2,m,c)^ QYieck nodes in the set J\f^ 2 ' R1 ' + ^ were regrouped into the set J\f^ ,R2 ^ from all 
the other sets JV^ 1 ' 111 ' 1 ^, 2 < % < d c , at iteration 1, HR1-R2. Check nodes in the set J\f^ R1 ' c \ however, 
were regrouped into the set M^' R2,V> from the set M[ 1,R1,1 \ Edges in the set J\f[ 2 ' R1 ' + ^ are responsible for 
the regrouping of variable nodes at iteration 2, HR2-R1. We denote by p^' R1 ^ the conditional probability 
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that an edge is adjacent to a check node in given that 1) it emanates from an unverified variable 

node 2) it is not adjacent to a check node in the set . This is indeed the probability that a variable 

node has an edge that increases its index. This probability is calculated as follows: 

pf' m) = Pr[c e g A/f m ' +) k e )C^\c e i M 2 ' m ' C \ 

_ Pr[c e G A/f m ' +) ] Pr[v e E IC^\c e E A/f'^] Pr[c e j M^ C) \v e E IC^\ c e E A/f m ' +) ] 

¥T[v e EW),c e £Nl 2 ' R1 > C) ] 
Pr[c e G M[ 2 ' R1 ' +) ] PrK G IC^\c e E A/f'^] Pr[c e g A/f^K G £( 2 >, c e G A/f m ' +) ] 

PrK G /C<*U G U^A/f^ UA/f m ' +) ] 
Pr[c e G A/f m ' +) ] Pr[v e E K^\c e E A/f' m ' +) ] Pr[c e $ M?' m ' C) \v e E /C< 2 >, c e E A/f m ' +) ] 



Pr[c e G Afi 2 ' R1 ' +) ] Pr[v e E m\c e E A/f + ]T Pr[^ (2 ' m ' 1} ] Pr[t; e G K^A/?' 



x 1/4 



i=2 



d c 

(2,fll,+) . V^.„(2,fll,l) 



i=2 

where p^ R1 < + ^ denotes the probability that a check node belongs to the set Af^ 2 ' R1 ' + \ Since the two sets 
A/f m ' +) and A/f m ' C) are disjoint and their union is the set Af^ 2,R1,1 \ we have: 

d c d c —j 

„(2,m,+) _V^V^ „(l,fll,l)„(l,.R2) „(2,fll,C) _ „(2,itt,l) „(2,fll,+) 

j=2 i=0 

Hence, the probability P^.'^, i G {0, 1}, that a variable node from /C- 1 '^ 1 '^ is regrouped into ICf' R1 ' 2 \ 
i < j < d v , is calculated as follows: 

c j = (";_"•) (»fr i 1 - ^"f • * = °> '•=<•-.<«.■ 

Finally, the probability V^ Rl ^ that a variable node in the support set belongs to the set /Cj 2,fl1 ^ is calculated 
by: 

i 

(2,m) _ (i,m,2) (2,m) . _ n , 



i=0 



The probability p£' R1 ' 2 ^ that a variable node in the support set belongs to the set JC^ 2 ' R1 ' 2 \ is calculated 

(2 R\) 

based on the set of verified variable nodes at this stage. Variable nodes in the set KL -' , 2 < j < d v , are 
all verified. Variable nodes in the set K Q ' are left intact, and a fraction of the variable nodes in the set 
/Q ' ' are verified. The procedure to find this fraction is as follows. 
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The set /cf' m ^ consists of two sets of variable nodes: /C^'/^ and K,f^ l \ Variable nodes in the set 
/C^'f^ are neighbor to check nodes in the set J\f^ 2 ' R1 ' + \ while variable nodes in the set /C^'/^ are 
neighbor to check nodes in the set Since the structure and evolution of the two sets 

and are different, the sets K^ RV> an ^ ICf^ R1 ^ also evolve differently. The sets J\f^ 2 ' R1 ' + ^ and 

are formed at iteration 1, HR1-R2. We shall partition the two sets further into subsets -Af±J 
and J\f^J Rl,C \ < j < d c — 1. At iteration 2, HR1-R1, the subsets are to be regrouped based on their 
second index, just like any other set of check nodes. Hence, we have: 



(2,m,l,+) _ D (l,fl2,l,+) V (2,R1) , _ n , 1 

j=fc 

(2,m,l,C) _ (1„R2,1,C) D (2,fll) , _ n , , 



where, 



and, 



j=fe 



(2) 

,(2,m) = (j\(i_ p (2,fli)V' ^(2,m)V'~ fe (2,m) _ , P A 



5 



„(l,fl2,l,+) _ V^„(1,R1,1)„(1,R2) 
k=2 

„(l,fl2,l,C) _ „(l,m,l)„(l,il2) 

Aij- ^in- 



variable nodes in the two sets /Cq^'/^ and /C^' 1 i?1 ' ) are verified at iteration 2, HR2-R1, if and only if, they 
are neighbor to check nodes in the sets Af^ 111 ' 1 '^ and J\f[ 2 ^ R1,1 ' C \ respectively. Therefore, the parameters 



y(2,m,+) an( j j(2,Ri,c) ^ (j e fl nec i as t ne respective fraction of variable nodes in KlQ^ and K.^^ that are 



verified at iteration 2, HR2-R1, are calculated as follows: 



f(2,Rl,+) _ M,o 

J r] _1 



„(2,fll,l,+) 



f(2,Rl,C) = M.o 



dc-1 

E„(2,m,i,+) 

fc=0 

(2,m,i,c) 



d c -l 



fc=0 

Finally, for the set of probabilities P K K ' ' , we have: 

v {2,m,2) 1 V (2,R1) 1 yi(l,fll,2) V (2,R1) 

F £o ' ]y(2,Rl) K o " ]y(2,Rl) K o ^oto ' 

(2,m,2,+) _ 1 (i,m,2) (2,m) /, _ f (2,m,+)\ 
" ]y(2,Ri) K o F !Con \ J ) ' 

„(2,m,2,C) _ 1 ^(1,^1,2)^(2,^1) /-, _ f (2,Rl,C)\ 
P Ki - N (2,Rl) F Ki ^K in \ L J ) » 
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pjg* 1,2) = 0, j = 2,---,d v . 

The normalization factor j\K 2,m ) is used to make the set of parameters P^f* 1 ' 2 ^ a valid probability measure, 
and is calculated by: 

N (2,R1) = p(l,Rl,2) p (2,Rl) (l,Rl,2) p (2,Rl) ^ _ r(2,Rl,+)\ , p (l,Rl,2) p (2,Rl) ^ _ r(2,Rl,C)\ 

The probability that a variable node belongs to the support set and remains unverified after iteration 2, 
is calculated as follows: 

Q (3) = a {2) ( p (l,Rl,2) (2,R1) (l,m,2) (2,R1) , _ f (2,Rl, +)) + (1,R1,2) (2,R1) , _ f (2,Rl,C)}\ 
= a (2) JV (2,JU)_ 

For iteration 2, HR1-R2, we find the probability P^^ in the following. (For simplicity, some superscripts 
are omitted. They appear when there is a risk of ambiguity.) 



<lr 



' IV./; Ir, € K^\c t c U^ 2 -" ' :- 

J'=2 

= l - Pi[v e e K%' m \ f e = 0\v e e K?\ c e e\J A/}] 

3=2 

d c 

- Yi[v e e /Cf' m) ,/ e = 0\v e e K?\c e e 

3=2 

d c 

= 1 - Pr[v e e JC \v e e JC, c e g (J J\fj] 

3=2 

d c d c 

-Vi[v e e !Ct\v e elC,c e e |J A/}] Pr[/ e = 0\v e e 1Cf,v e elC,c e e [J A/}] 

3=2 3=2 

d c d c 

-Vx\v e e K%\v e elC,c e e |jA/}]Pr[/ e = 0\v e g K%,v e g /C,c e g |Ja/}], 



= 1 



3=2 3=2 

Yx\v e G /Co|^ e G K] Pr[c e G U^-^K e ^o,^e e /C] 



Pr[ Ce G U,t 2 ^ke e /C] 

_ Pr[^e G /C+|^ e G /C] Pr[c e G U^V>e € E+ v e G /C] _ + 

Pr[c e GU-= 2 A/> e G/C] 1 j 

_ PrK € g>e ^ g Pr[c e G U^V>e £ £ K] _ 

Pr[c e GU-l 2 ^ke/C] 1 j ' 

Pr[t; e G /Co|^ e G /C] Pr[c e G (jL-^k e /C ,^e G /C] 



1 - 



1 -Pr[c e eMk e £] 
PrK G Ej> e G /C] Pr[c e G U^-A/}^ G /C+ v e G /C] _ + 
l-Pr[c e GM|v e G/C] ^ ; ^ 
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Pr[v e e /CfK G /C]Pr[c e G (J^^K e /Cf, We g /C] /r f(2 rr 



1 -Pr[c e eNi\v e G /C] 



(1 - / (2 ' R1 ' C) ) , 



p(i,ja,2) p (2,m) 1 



1 _ p(2,m) 



p (i,m,2) p (2,iii) 

'Co JCo-fi 



rfp - 1 



„(i,iii,2) (2,m) 



p(2,m) 
oL — 1 



(1 _ < f(2,Hl,+)) 



1 _ p(2,Rl) 



(1 _ y?(2,fli,C)j _ 



Hence, the probability Pfy^ that a check node belongs to the set of check nodes M^jJf^ is calculated 
as follows: 



jK-Of^ra-^) . «=2.-.<t, *=o,-,i, i=o,...,<t-*. 



Note that the probability p( 2 jR1 ) is calculated by: 



„(2,in,i,+) „(2,m,i,c) 
p(2,fli) = M + M 



As we explain in the following, the evolution of the sets j\f^ R1 ' 1 ^ an d ^ 2 ' R1 ' 1 > C! ) [ s a ^ more involved. 
A variable node in the set K,Q R1 \ 1 < i < d v , has i neighboring check nodes in the set J\f^ 2 ' R1 ' 1,+ \ 
On the other hand, a variable node in the set , 1 < % < d v , has 1 neighboring check node in the 

set and i — 1 neighboring check nodes in the set j\f^ 2 ' R1 ^' + \ Now let us consider a check 

node c G J\f( 2 ' R1 ' 1,+ \ Suppose, c is neighbor to a variable node v G /C^'/* 1 ^. Variable node v is verified 
if and only if c belongs to the subset jV 1 ^ ) '' R1 ' 1 ' + ^. Hence, c is regrouped as a zero-valued check node 
if it belongs to the set J\f^Q R1 ' 1,+ \ Now, suppose c is neighbor to a variable node v' G U^^ot/^' or 
f ' G Ufl2 ^-lt'i^ 1 ^ Since the variable node v' is verified with probability 1, check node c is regrouped into 
the set of zero-valued check nodes with probability 1 as well. A similar argument holds true for the set 
of check nodes in j\f( 2 > R1 > 1 ' c ') an d variable nodes in K^ R1 \ Therefore, to regroup check nodes in the sets 
A/f* 1 ' 1 ^ and A/f'* 1 ' 1 '^, we need to divide them further based on their neighbors; i.e., whether or not 
they are neighbor to variable nodes in the sets K?^ R1 ^ and )C^ R1 \ respectively. 

We partition the set A/f'* 1 ' 1 '^ into subsets A/f' ra ' + ' 0) and A/f' R2 ' + ' F) . We also partition the set A/f'* 1 ' 1 '^ 
into subsets U[ 2,R2,C,0) and A/f' R2 ' c,F) . Check nodes in sets U[ 2 ' R2 ' + ' F) and A/f' R2,c,F) are neighbor to 
variable nodes in sets K^ RV> and JC^ R1 \ respectively. Any other check node in the set j\f^ 2 ' R1 ' 1 ' + \ not 
being part of the set j\f( 2 ' R2 > + < F ^ j s grouped into the set j\f( 2 ' R2 ' + >°\ Similarly, any other check node in 
the set X} 2 ' R1 ' 1,C) , not being part of the set A/f' R2 ' c ' F) is grouped into the set j\f( 2 < R2 < c >°\ 

Let p^^ 2 '" 1 "' ) an d p(?> R2 >+' F ) denote the probabilities that a check node belongs to sets j\f( 2 ' R2 ' + '°) an d 
respectively. Probabilities p^j.' R2 ' C '°^ an d p& R2 > c ' F ) are defined similarly. The calculation of 
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the probability P%f 2 ' + ' F) , < % < d c - 1, follows: 

P ( ^' + ' F) = Pr[ce^ R2 ' + ' F) ], 

= Pr[c G M% R1 ^] Pr[c G <f 2 ' + > F) \c G ^J m ' 1 ' +) ], 

= Pr[c G A/JJ* 1 ' 1 ^] PrK G fcftfV G A^'+U £ K (2) ], 

_ p r A ,( 2 ,m,i, +)l PrK G /CgfV G /C( 2 )] Pr[c e G A/g^Ve € gjgU G /C( 2 >] 
1 M J Pr[c e G<^KG^)] 

= Pr[c G A/iV ' ; ] jqj-^ , 



where, 



Hence, 



A = f>[, e G /Cgf U G M^ +) \v e G /C< 2 >], 
j'=i 

= E^M^ 11 , 

j'=i 

B = G /Cgf U G A/g^^k e K (2) ], 

J'=2 

,(!./>'] 2)^(2,m) 

3=2 



„(i,m,2) (2,i?i) 

V (2,R2,+,F) _ (2,R1,1,+) ^/C ot i 



Ei"S" M,1 C + Eo' - ^•"■"•C 



„(2,H2,+,0) _ „(2,m,l,+) „(2,fl2,+,F) 



Following a similar approach, we have: 



(2,R1) 

n (2,R2,C,F) _ n (2,Rl,l,C) 



9 (2,m) 



E"i 



n (2,R2,C,0) _ n (2,Rl,l,C) n (2,R2,C,F) 



Since check nodes in the sets A/i (2 " R2,+ '° ) , j\f( 2 ' R2 ' C '°\ U^ R%+ ' F \ and N§ m,C ' F) receive 4 verified 
messages at iteration 2, HR1-R2, all such check nodes are grouped into the set A/ ' . On the other 
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hand, check nodes in the sets 1 < i < d c — 1, are grouped into the sets 

H^- m,1,Cr> . After the regrouping, the probability P^ 2 ' 1 ^ that a check node belongs to the set N^f* 3 is 
calculated as follows: 



n (2,R2,l) T ,(2, J R1, 1)^(2, R2) , Q , n , 



i=k 
dc 



(2,^2,1,+) _ „(2,fll,l)„(2,fl2) • _ n , , 



i=2 



(2,R2,1,C) _ (2,R2,C,F) (2,R2,+,F) . - , 1 

^ M - %ij + ' J - ■ ■ ■ , «c - 1, 

D (2,JH,1) _ D (2,JU,1) (2,i?2,C,0) (2„R2,+,0) (2,i»,l)„(2„R2) . _ n , . 

1=2 

(2 i?2) 

As previously defined, the probability ' , is needed to find the probability that a zero-valued variable 
node is verified at this stage. This probability is calculated as follows: 

_ g Pr [c e e A/g^j PrK £ A( 2 ) |c e £ A/g^] 

1=1 e E pr ^ e ^r 2,l) i pr ^ e a(2) i^ e A/ "fi' i?2,i) ] 

i=0 j=l 



1 P Vo. 



■,(2,H2,1) / J_ 

>P ^ u 

d c d c — l 

" EE^f' 11 (| 



E 



i=0 j = l 

J' 



d c d c — l 
i=0 j=l 

Note that the denominator is indeed Pr[t> e £ A^ 2 ^] = P^. Hence, the probability p^' R2 ' 2 \ defined as 

f2 /?2 21 

the probability that an unverified zero- valued variable node belongs to the set A) ' ' ' , is calculated as 
follows: 

P ^ = ( d A {pt R2) y (i - P f^y-\ i =o,...,d v . 

Lastly, the probability that a variable node is zero-valued and remains unverified for iteration 3 is given 
by: 

D (3) _ D (2)„(2,fl2,2) 
^A — ^A ^A 

The analysis of an iteration £, £ > 3, is similar to that of iteration 2. The update rules for a generic 



iteration £, £ > 2 is given in subsection V-E 



65 



Appendix IV 
Details of Concentration Results 

A. Probability of Tree-Like Neighborhood 

Consider a particular variable node v. We enumerate Mg and Cg, respectively, the number of variable 
nodes and check nodes in Af£ e under the assumption that J\f 2i is tree-like, as follows: 

t 

Me = l + d v (dc - 1) $^(4 - ir\d v - if- 1 , 
%=x 

and 



c l = d v Y,(d c -iy- 1 (d v -iy- 1 . 



i=i 



Recall that m and n are the number of check nodes and the variable nodes, respectively. Fix £*, and let 
£ < t . Assuming that is tree-like, the probability that M„ e+1 is tree-like is lower bounded by 



m 

and assuming that A/^ +1 is tree-like, the probability that J\f^ e+2 is tree-like is lower bounded by 



1 - 



Mt ^M e+1 -M e 



n 



for sufficiently large n (see Appendix A in [35 1). Thus, the probability that J\f^* is tree-like is bounded 



from below by (lower bounding is done by using the chain rule) 

l _Mt L \ Mi ( 1 _^ Ci 
n J \ m 

for sufficiently large n, and since (1 - M^/n) Ml! * > (1 - M£/n), and (1 - C e */m) Ci * > (1 - C%/m) 
for large n, then 

Pr[A/T is not tree-like] < l-(l- M ^ ^ C ** 



n J \ m 



< — 5- -| 5- 

n m 
Ml+Cl(d c /d v ) 

n 

Taking 7 = M|» + Cf,(d c / d v ), we see that 7 is a constant not depending on n, and thus, 

PrLA/!r*is not tree-likel < -. 

n 
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B. Derivation of Bounds on the Difference between Two Consecutive Elements in Martingale Sequences 

Consider two realizations V and T" from the ensemble of all graphs, inputs, and weights. We consider 
two cases: 

• Case I: Realizations T' and T" are the same except for the value of a variable node. 

• Case II: Realizations T' and T" are the same except for the connection of two variable nodes v\ and v 2 
to two check nodes c 1 and c 2 ; i.e., T : c\ G M(v 1 ),c 2 G M(v 2 ) and T" : c 1 G A^(^2),c 2 G -M('Ui). 

• Case III: Realizations V and T" are the same except for one weight of an edge in the graph. 

Suppose the difference between the two realizations results in the difference of N(£) in verified variable 
nodes at iteration I. The goal in this appendix is to find an upper bound on N(£) for Genie, LM and SBB 
algorithms. Without loss of generality, suppose that realization V verifies less number of variable nodes 
compared to T". As we are seeking an upper bound, we assume the worst configuration for realization 
V and the best configuration for T". The realization V being the worst configuration implies that the 
variable node/edge/weight that is the difference between the two realizations V and T" results in no 
verification of variable nodes up to iteration I under consideration for V . On the other hand, we assume 
that the configuration in T" is so that the verification of a variable node at an iteration, results in the 
maximum number of verifications in the next iteration. With these configurations for V and T", in order 
to maximize the difference between the verified variable nodes between the two realizations at iteration 
I, we need to maximize the number of variable nodes that can be verified at iteration 0. 

Let the parameters E(£), D(£), Z(£) denote the difference in the number of variable nodes verified at 
iteration £ between the two realizations due to the ECN, D1CN, and ZCN verification rules, respectively. 
Therefore, N{£) = E{£) + D{£) + Z{£). 

In what fallows we find an upper bound for N(£) in the case of the Genie algorithm. A similar reasoning 
can be used for the other algorithms. 

1) Genie: The only verification rule applied to the Genie is D1CN. We find the maximum N(£) for each 
of the three cases discussed above. Focusing on case I, three possibilities exist: 

1) A variable node in T' is non-zero, while the same variable node in T" is zero. 

2) A variable node in T' is zero, while the same variable node in T" is non-zero. 

3) A variable node in both T' and T" is non-zero, but with two different values. 

In the first scenario, the worst configuration is such that the variable node under consideration remains 
unverified up to iteration L As the corresponding variable node is zero in realization T", this means that 
the d v neighboring check nodes have degrees smaller by one compared to their counterparts in realization 
V . The best configuration T" is then formed if each one of these d v check nodes has degree 1. Each 
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such check node results in the verification of one variable node with the D1CN rule. So, N(0) < d v . 

A variable node verified based on D1CN at iteration i — 1 (0 < i < £) can reduce the degree of at most 
d v — 1 check nodes. In the best case, each such check node has degree 1 which results in the verification 
of another variable node at iteration i. Therefore, we have: 

D{i) <D(i-l)(d v -l), 

which results in 

N(£) < d v (d v - if, e>o. 

In the second and third scenarios, considering the fact that T' is the worst configuration and that the 
realizations T' and T" have the same weighted graph and the same input vector (except for the variable 
node under consideration), the realization T" can not verify more variable nodes than T'. Therefore, in 
this case, N(£) = 0. This is assuming that no false verification happens in either realizations. 

Now we consider Case II. Possible scenarios for the values of v\ and v 2 are as follows: 

1) v i = 0, v 2 7^ (due to the symmetry, this is the same as Vi ^ 0, v 2 = 0), 

2) Vl = 0,v 2 = 0, 

3) Vl ^0,v 2 ^0. 

To make the worst realization, we assume that all the neighbors of c\ are zero-valued variable nodes, and 
we let C2 to be neighbor to only one other non-zero variable node, say t> 3 . In this realization, as c 2 has 
degree 2, then v 2 and v 3 can not be verified in iteration zero. However, when we switch the connections, 
ci and c 2 will both have degree 1. Therefore, both variable nodes v 2 and v 3 are verified based on the 
D1CN rule. So, in this scenario N(0) = 2. Again, to find the maximum number of variable nodes that 
can be verified in further iterations, we assume that the verification of each variable node at iteration i — \ 
results in the verification of d v — 1 other variable nodes at iteration i. We thus have: 

D(i) < D(i - l)(d v - 1), 

which results in 

N{€) < 2(d v - if, £>0. 

Due to the symmetry of the problem, the other two scenarios result in no difference in the number of 
verified variable nodes. 

For Case III, the weights are, by definition, non-zero. Thus, the change in the weight of an edge in the 
graph, has no effect on the recovery of variable nodes. 
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Based on the above discussions, we have the following upper bound for the Genie algorithm: 

N{£) < d v {d v - l) w , £ > 0. 

2) LM: This algorithm applies D1CN and ZCN verification rules in the first and second half-rounds of 
each iteration. Following the same steps as those for the Genie algorithm, one can show that the maximum 
N(£) between all possible configurations for cases I, II, and III is achieved when a variable node changes 
its value from a non-zero value to a zero value. With the same logic as in the Genie, d v variable nodes 
can be verified with the D1CN rule at iteration in T" that can not be verified in T'. We thus have 

N(0) = d v . 



When a variable node is verified based on D1CN at the first half-round of iteration % — 1, at most d v — 1 
check nodes can have a value equal to zero, each of which results in the verification of d c — 1 variable 
nodes with the ZCN rule in the second half-round of the same iteration. 

On the other hand, when a variable node is verified based on ZCN at the second half-round of iteration 
% — \, at most d v — 1 check nodes can have a degree one for the first half-round of the next iteration. So, 
d v — 1 variable nodes can be verified using the D1CN rule at iteration i. 

Putting these two steps together, we form the following recursive formulas: 

D(i) < Z(i - \){d v - 1), 

z{i) < D(i)(d v -i)(d c -i). 

Solving the recursions with the initial condition Z(0) = d v , we have: 

z(e) <d v (d v -i) 2e (d c -iy, 
D^Kdvidv-i^idc-iy- 1 , 

N{£) = Z{£) + D{£) < 2Z{£) = 2d v (d v - lf\d c - l) e . 



3) SBB: This algorithm applies all ZCN, D1CN, and ECN verification rules. In order to find the upper 
bound on N(£), we first find the number of variable nodes verified at a generic iteration i due to the 
verification of a variable node at iteration % — 1 based on each verification rule, separately. Then we find 
the maximum number of variable nodes that can possibly be verified at iteration zero based on each of 
the three Cases I, II, or III and use them as initial conditions to solve recursive formulas like the ones 
we saw for LM. 

Assume two check nodes with the same value result in a variable node being verified according to the 
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ECN rule. Therefore, the two check nodes result in the verification of 2d c — 2 variable nodes in total with 
ZCN verification rule in the next round of the same iteration. Also, the other d v — 2 adjacent check nodes 
face a reduction in the degree as well as change in the value. Therefore, they can result in the verification 
of (d v — 2)(d c — 1) variable nodes with ZCN in the next round of the same iteration, or the verification 
of d v — 2 variable nodes based on the D1CN or ECN in the next iteration. One may be able to find out 
the best combination of rules that results in the maximum number of verified variable nodes. However, 
as we are interested in finding an upper bound, we assume that all the events above can happen at the 
same time. 

If a variable node is verified according to the ZCN rule, the d v — 1 adjacent check nodes do not face a 
change in their values, and thus can not contribute to the verification of other variable nodes based on 
ECN or ZCN. Therefore, each such check node verifies 1 variable node based on the D1CN rule. 

When a variable node is verified according to the D1CN rule, d v — 1 check nodes face a reduction in 
degree as well as change in value. With the same reasoning as that used in the ECN case, each such 
check node results in the verification of d c — 1, 1, and 1 variable nodes based on the ZCN, D1CN, and 
ECN rules, respectively. 

Putting everything together, we have: 

E(i) < E(i - \){d v - 2) + D(i - l)(d v - 1), 

D(i) < Z(i - l)(d v - 1) + D(i - l)(d v - 1) + E(i - \){d v - 2), 

Z{{) < E{{){d v ){d c - 1) + D(i)(d v - 1)(4 - 1). 

Using the third inequality, we can rewrite the first two inequalities as follows: 

E(i) < E(i - l)(d v - 2) + D(i - l)(d v - 1) < d v E(i - 1) + d v D(i - 1), 
D(i) < E(i - 1) (d v (d v - 1)(4 - 1) + (d v - 2)) + D(i - l)(d v - 1) ((d v - 1)(4 - 1)) , 
< (d 2 v d c + d v )E{i - 1) + d 2 v d c D(i - 1). 

Looking for an upper bound, we assume that the inequalities are satisfied with equalities. Therefore, we 
have: 



E(i) = d v E(i - 1) + d v D(i - 1), 

D(i) = (d 2 v d c + d v )E(i - 1) + d 2 v d c D{i - 1) = d v d c E(i) + d v E(i - 1). 



70 



Replacing D(i — 1) of the first equality by its value from the second one, we get: 

E(i) = (4 + d 2 v d c ) E(i - 1) + d 2 v E(i - 2). 

This is a second order linear homogeneous recurrence relation. Using the characteristic polynomial 
technique, a closed form solution can be found. To find a more elegant solution, however, we derive 
the following upper bound on E(i). 

E(i) = (d v + d 2 v d c ) E(i - 1) + d 2 v E(i - 2) < (d v + d 2 v d c ) E(i - 1) + M±?^£(, _ 2 ). 
Considering the inequality as equality, and using the characteristic polynomial technique, we then obtain 

E(i) = K 1 (%d e +ld v y + K 2 (-^ 

for some K 1 and K 2 that depend on the initial conditions. We can simplify this even further as follows: 

E(i) < K x (d 2 v d c + 2d v )\ 

By replacing this upper bound in D(i) and Z(i), we get: 

D(i) < d v d c K x [d 2 v d c + 2d v ) 1 + d v K x (d 2 v d c + 2d v )^ 1 < 2d v d c K 1 (d 2 v d c + 2d v ) 1 , 
Z(i) < d v d c (E(i) + D(i)) = d v d c K x (d 2 v d c + 2d v ) 1 (1 + 244), 

and thus 

N{i) = E(i) + D(i) + Z(i) < (d v d c + 1)K X (d 2 v d c + 2d v f (1 + 244), 
<K 1 (d 2 v d c + 2d v ) i+2 , 

where K 1 is the maximum number of variable nodes that can be verified at iteration for the Cases I, 
II, and III. Once again, one can show that the maximum is achieved when a variable node changes value 
from a non-zero value to zero, and where 4 variable nodes are verified using the ZCN rule. Therefore, 
we have K\ — d v . Hence, 

N(£) < 4 (d 2 v d c + 2d v ) i+2 . 



